Sierra Sergio, Ramo Rubén, Padilla Marc, Cobo Adolfo
Complutum Tecnologías de la Información Geográfica, COMPLUTIG, 28801, Alcalá de Henares, Spain.
Photonics Engineering Group, Universidad de Cantabria, 39005, Santander, Spain.
Environ Monit Assess. 2025 Mar 18;197(4):423. doi: 10.1007/s10661-025-13870-5.
This study presents a deep learning approach to high-resolution land cover classification that addresses the challenge of training on an exceptionally small dataset. Manual annotation of land cover data is time-consuming and labor-intensive, which makes data augmentation crucial for improving model performance. Although data augmentation is a well-established technique, a comprehensive comparative evaluation of a wide range of augmentation methods applied specifically to land cover classification has been lacking. Our work fills this gap by systematically testing eight data augmentation techniques across four neural networks (U-Net, DeepLabv3+, FCN, PSPNet) using 25 cm resolution images from Cantabria, Spain. In total, we generated 19 distinct training sets and trained and validated 72 models. The results show that data augmentation can boost model performance by up to 30%. The best model (DeepLabv3+ with flip, contrast, and brightness adjustments) achieved an accuracy of 0.89 and an IoU of 0.78. We then used this optimized model to generate land cover maps for 2014, 2017, and 2019, which were validated against 580 samples selected by stratified sampling based on CORINE Land Cover data, achieving an accuracy of 87.2%. This study provides a systematic ranking of data augmentation techniques for land cover classification and offers a practical framework that helps future researchers save time by identifying the most effective augmentation strategies for this task.
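As an illustration only, the augmentation combination reported for the best model (flip, contrast, brightness) and the two reported metrics (accuracy, IoU) might be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation: the function names, the 0.5 flip probabilities, the contrast range of 0.8 to 1.2, the brightness range of ±0.1, the assumption of float images scaled to [0, 1], and the averaging of IoU over classes are all assumptions, since the abstract does not state them.

```python
import numpy as np

def flip_pair(image, mask, horizontal=True):
    """Flip image and mask together so labels stay aligned with pixels."""
    axis = 1 if horizontal else 0
    return np.flip(image, axis=axis).copy(), np.flip(mask, axis=axis).copy()

def adjust_brightness(image, delta):
    """Add a constant offset to a float image assumed to lie in [0, 1]."""
    return np.clip(image + delta, 0.0, 1.0)

def adjust_contrast(image, factor):
    """Scale deviations from the image mean by `factor`."""
    mean = image.mean()
    return np.clip((image - mean) * factor + mean, 0.0, 1.0)

def augment(image, mask, rng):
    """Return one randomly augmented (image, mask) pair.

    Geometric changes (flips) are applied to both arrays; radiometric
    changes (contrast, brightness) are applied to the image only.
    All probabilities and ranges below are assumed, not from the paper.
    """
    if rng.random() < 0.5:
        image, mask = flip_pair(image, mask, horizontal=True)
    if rng.random() < 0.5:
        image, mask = flip_pair(image, mask, horizontal=False)
    image = adjust_contrast(image, rng.uniform(0.8, 1.2))
    image = adjust_brightness(image, rng.uniform(-0.1, 0.1))
    return image, mask

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class matches the reference."""
    return float((pred == target).mean())

def mean_iou(pred, target, num_classes):
    """Intersection over union, averaged over classes present in either map."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps; skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))
```

Calling `augment(image, mask, np.random.default_rng(seed))` repeatedly on each annotated tile would enlarge a small training set. Keeping the mask untouched by the contrast and brightness steps matters, because those steps change pixel values but not the land cover class at each location.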