Liu Shukuan, Cao Shi, Lu Xia, Peng Jiqing, Ping Lina, Fan Xiang, Teng Feiyu, Liu Xiangnan
School of Information Engineering, China University of Geosciences, Beijing 100083, China.
The Second Surveying and Mapping Institute of Hunan Province, Changsha 410004, China.
Sensors (Basel). 2025 Jan 5;25(1):261. doi: 10.3390/s25010261.
Extracting fragmented cropland is essential for effective cropland management and sustainable agricultural development. However, this task is challenging because fragmented cropland has irregular, blurred boundaries and diverse crop types and distributions. Deep learning methods are widely used for land cover classification. This paper proposes ConvNeXt-U, a lightweight deep learning network that extracts fragmented cropland efficiently while reducing computational requirements and cost. ConvNeXt-U retains the U-shaped structure of U-Net but replaces the encoder with a simplified ConvNeXt architecture; the decoder is unchanged from U-Net, and the lightweight CBAM (Convolutional Block Attention Module) is integrated. CBAM adaptively reweights feature maps along the channel and spatial dimensions, emphasizing key features and suppressing redundant information, which sharpens the capture of edge features and improves extraction accuracy. The case study area is Hengyang County, Hunan Province, China, using GF-2 remote sensing imagery. The results show that ConvNeXt-U (IoU = 79.5%, Acc = 85.2%) outperforms existing methods such as Swin Transformer (Acc = 85.1%, IoU = 79.1%), MobileNetV3 (Acc = 83.4%, IoU = 77.6%), VGG16 (Acc = 80.5%, IoU = 74.6%), and ResUnet (Acc = 81.8%, IoU = 76.1%). Under the same conditions, ConvNeXt-U also infers faster, at 37 images/s, compared with 28 images/s for Swin Transformer, 35 images/s for MobileNetV3, and 0.43 and 0.44 images/s for VGG16 and ResUnet, respectively. Moreover, ConvNeXt-U handles the boundaries of fragmented cropland better than the other methods, producing clearer and more complete boundaries. These results indicate that the ConvNeXt and CBAM modules significantly improve the accuracy of fragmented cropland extraction, and that ConvNeXt-U is an effective method for extracting fragmented cropland from remote sensing imagery.
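The two-stage attention described in the abstract (channel reweighting followed by spatial reweighting) can be sketched in plain Python on a C×H×W feature map represented as nested lists. This is a minimal illustration of the CBAM idea only: the fixed 0.5/0.5 blend of the average- and max-pooled descriptors stands in for CBAM's learned shared MLP and 7×7 convolution, and is an assumption for illustration, not the paper's trained network.

```python
import math


def sigmoid(x):
    """Logistic gate mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """One gate per channel from its avg- and max-pooled descriptors.

    The fixed 0.5*(avg+max) blend is a hypothetical stand-in for
    CBAM's learned shared MLP.
    """
    gates = []
    for ch in fmap:
        flat = [v for row in ch for v in row]
        avg, mx = sum(flat) / len(flat), max(flat)
        gates.append(sigmoid(0.5 * (avg + mx)))
    return gates


def spatial_attention(fmap):
    """One gate per pixel from avg and max across channels.

    The fixed blend again stands in for CBAM's learned 7x7 conv.
    """
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    gate = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            vals = [fmap[c][i][j] for c in range(C)]
            gate[i][j] = sigmoid(0.5 * (sum(vals) / C + max(vals)))
    return gate


def cbam(fmap):
    """Apply channel attention first, then spatial (CBAM's ordering)."""
    cg = channel_attention(fmap)
    refined = [[[v * cg[c] for v in row] for row in ch]
               for c, ch in enumerate(fmap)]
    sg = spatial_attention(refined)
    C, H, W = len(refined), len(refined[0]), len(refined[0][0])
    return [[[refined[c][i][j] * sg[i][j] for j in range(W)]
             for i in range(H)] for c in range(C)]
```

Because both gates lie in (0, 1), the module preserves the feature map's shape and only rescales activations, which is why it can be dropped into a U-Net decoder without altering the surrounding layer dimensions.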