Meister, Martin, and John Qu. “Quantifying Seagrass Density Using Sentinel-2 Data and Machine Learning.” Remote Sensing 16 (March 27, 2024): 1165.
Abstract
Seagrasses, rooted aquatic plants growing completely underwater, are extremely important for the coastal ecosystem. They are an important component of the total carbon burial in the ocean, they provide food, shelter, and nursery habitat to many aquatic organisms in coastal ecosystems, and they improve water quality. Due to human activity, seagrass coverage has been rapidly declining, and there is an urgent need to monitor seagrasses consistently. Seagrass coverage has been closely monitored in the Chesapeake Bay since 1970 using aerial photographs and ground samples. These efforts are costly and time-consuming. Many studies have used remote sensing data to identify seagrass bed outlines, but few have mapped seagrass bed density. This study used Sentinel-2 satellite data, machine learning in Google Earth Engine, and the Chesapeake Bay Program field data to map seagrass density. We used seagrass density data from the Chincoteague and Sinepuxent Bays to train machine learning algorithms and evaluate their accuracies. Of the four machine learning models tested (Naive Bayes (NB), Classification and Regression Trees (CART), Support Vector Machine (SVM), and Random Forest (RF)), the RF model outperformed the other three, with an overall accuracy of 0.874 and a Kappa coefficient of 0.777. The SVM and CART models performed similarly, and NB performed the poorest. We tested two different approaches to assess the models’ accuracy. When we used all the available ground samples to train the models, our analysis showed that model performance was associated with seagrass density class, and that higher seagrass density classes had better consumer accuracy, producer accuracy, and F1 scores. However, the association of model performance with seagrass density class disappeared when we used the same training data size for each class. In that case, the very sparse and dense seagrass classes had higher accuracies than the sparse and moderate seagrass density classes.
This finding suggests that the training data affect machine learning model performance: uneven training data sizes across classes can bias the assessment results. Selecting proper training data is as important as selecting the machine learning model when using machine learning and remote sensing data to map seagrass density. In summary, this study demonstrates the potential to map seagrass density using satellite data.
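The accuracy measures named in the abstract (overall accuracy, Kappa coefficient, producer accuracy, consumer accuracy, and F1 score) can all be derived from a confusion matrix, and the equal-training-size comparison amounts to subsampling every class down to the size of the smallest one. The following is a minimal illustrative sketch in plain Python, not the authors' Google Earth Engine workflow. The function names `accuracy_metrics` and `balanced_subsample`, the convention that rows hold reference classes and columns hold predicted classes, and the example numbers are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def accuracy_metrics(cm):
    """Compute accuracy measures from a square confusion matrix.

    Assumed convention: cm[i][j] is the number of samples whose reference
    (ground-truth) class is i and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    row_tot = [sum(cm[k]) for k in range(n)]                         # reference totals
    col_tot = [sum(cm[i][k] for i in range(n)) for k in range(n)]    # predicted totals

    overall = correct / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2
    # Kappa is undefined when chance agreement equals 1; report 0.0 in that case.
    kappa = (overall - expected) / (1 - expected) if expected < 1 else 0.0

    per_class = []
    for k in range(n):
        # Producer accuracy: share of reference samples of class k found by the model.
        producer = cm[k][k] / row_tot[k] if row_tot[k] else 0.0
        # Consumer (user's) accuracy: share of predictions of class k that are correct.
        consumer = cm[k][k] / col_tot[k] if col_tot[k] else 0.0
        denom = producer + consumer
        f1 = 2 * producer * consumer / denom if denom else 0.0
        per_class.append({"producer": producer, "consumer": consumer, "f1": f1})
    return overall, kappa, per_class


def balanced_subsample(samples, labels, seed=0):
    """Randomly keep the same number of samples per class (the smallest class size)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    size = min(len(group) for group in by_class.values())
    balanced = []
    for label in sorted(by_class):
        for sample in rng.sample(by_class[label], size):
            balanced.append((sample, label))
    return balanced


# Hypothetical two-class example (not data from the study).
overall, kappa, per_class = accuracy_metrics([[5, 1], [2, 4]])
```

Evaluating a model once on all available samples and once on the output of `balanced_subsample` is one way to check whether per-class accuracy differences reflect class difficulty or merely class size.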