Savitha Chirasmayee, Talari Reshma
Department of Civil Engineering, National Institute of Technology Andhra Pradesh, Tadepalligudem, 534101, India.
Environ Monit Assess. 2025 Mar 19;197(4):437. doi: 10.1007/s10661-025-13880-3.
Advances in machine learning algorithms, together with high-resolution satellite datasets, have improved agricultural monitoring and mapping. However, high-resolution optical satellite data are often obscured by clouds and shadows, so the imagery does not capture the complete crop phenology, which limits map accuracy. Choosing a suitable classification algorithm is also essential, because the performance of each machine learning algorithm depends on the input datasets, hyperparameter tuning, and the training and testing samples, among other factors. To overcome the cloud and shadow limitation of optical data, this study uses a Sentinel-2 greenest pixel composite to generate a crop-type map for an agricultural watershed in Tadepalligudem, India. To identify a suitable model, the study also compares the performance of four machine learning algorithms: gradient tree boost, classification and regression tree, support vector machine, and random forest (RF). Crop-type maps are generated for two cropping seasons, Kharif (monsoon) and Rabi (winter), in Google Earth Engine (GEE), a cloud computing platform. Ground truth data are collected and split 70:30 into training and testing sets. The results demonstrate that the greenest pixel composite method can identify and map crop types in small watersheds even during the Kharif season. Among the four algorithms, RF outperforms the others in both the Kharif and Rabi seasons, with an average overall accuracy of 93.21% and a kappa coefficient of 0.89. The study also shows the potential of GEE for enhancing automated agricultural monitoring from satellite datasets while requiring minimal computational storage and processing.
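The two quantitative ideas in the abstract, greenest pixel compositing and accuracy assessment with overall accuracy and the kappa coefficient, can be illustrated with a minimal NumPy sketch. This is not the authors' code: the study was carried out in GEE, and the abstract does not say which calls were used. The function names, the array layout (time, height, width), the use of NDVI as the greenness measure, and the use of NaN for cloud-masked pixels are all assumptions made for illustration.

```python
import numpy as np


def greenest_pixel_composite(red, nir):
    """Per-pixel composite that keeps the observation with the highest NDVI.

    red, nir: arrays of shape (T, H, W) holding reflectance for T dates.
    Cloud- or shadow-masked observations are assumed to be NaN.
    Returns (composite_red, composite_nir, composite_ndvi), each (H, W).
    """
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)

    # NDVI = (NIR - Red) / (NIR + Red); masked or zero-sum pixels become non-finite.
    with np.errstate(divide="ignore", invalid="ignore"):
        ndvi = (nir - red) / (nir + red)

    # Rank invalid observations last so they are chosen only if no valid date exists.
    ranked = np.where(np.isfinite(ndvi), ndvi, -np.inf)
    best_date = np.argmax(ranked, axis=0)[np.newaxis, ...]  # shape (1, H, W)

    # Pull every band from the same "greenest" date at each pixel.
    composite_red = np.take_along_axis(red, best_date, axis=0)[0]
    composite_nir = np.take_along_axis(nir, best_date, axis=0)[0]
    composite_ndvi = np.take_along_axis(ndvi, best_date, axis=0)[0]
    return composite_red, composite_nir, composite_ndvi


def overall_accuracy_and_kappa(confusion_matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    Rows are reference (ground truth) classes, columns are predicted classes.
    """
    cm = np.asarray(confusion_matrix, dtype=float)
    n = cm.sum()
    observed = np.trace(cm) / n  # overall accuracy
    # Agreement expected by chance, from the row and column marginals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

Because each pixel takes its values from the date on which it was greenest, a single cloud-free observation per pixel within the season is enough to fill the composite, which is why the approach suits the cloudy Kharif season. The kappa coefficient complements overall accuracy by discounting the agreement that would be expected by chance given the class proportions.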