UNeXt: An Efficient Network for the Semantic Segmentation of High-Resolution Remote Sensing Images.

Authors

Chang Zhanyuan, Xu Mingyu, Wei Yuwen, Lian Jie, Zhang Chongming, Li Chuanjiang

Affiliation

College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234, China.

Publication information

Sensors (Basel). 2024 Oct 16;24(20):6655. doi: 10.3390/s24206655.

DOI:10.3390/s24206655
PMID:39460135
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11510939/

Abstract

The application of deep neural networks to the semantic segmentation of remote sensing images is a significant research area within the intelligent interpretation of remote sensing data. Semantic segmentation of remote sensing images holds great practical value in urban planning, disaster assessment, the estimation of carbon sinks, and other related fields. With the continuous advancement of remote sensing technology, the spatial resolution of remote sensing images is gradually increasing. This increase in resolution brings challenges such as large variations in the scale of ground objects, redundant information, and irregular object shapes. Current methods leverage Transformers to capture global long-range dependencies. However, Transformers introduce higher computational complexity and are prone to losing local details. In this paper, we propose UNeXt (UNet+ConvNeXt+Transformer), a real-time semantic segmentation model tailored for high-resolution remote sensing images. To achieve efficient segmentation, UNeXt uses the lightweight ConvNeXt-T as the encoder and a lightweight decoder, Transnext, which combines a Transformer and a CNN (Convolutional Neural Network) to capture global information while avoiding the loss of local details. Furthermore, to utilize spatial and channel information more effectively, we propose an SCFB (SC Feature Fuse Block) to reduce computational complexity while enhancing the model's recognition of complex scenes. A series of ablation experiments and comprehensive comparative experiments demonstrate that our method not only runs faster than state-of-the-art (SOTA) lightweight models but also achieves higher accuracy. Specifically, our proposed UNeXt achieves mIoUs of 85.2% and 82.9% on the Vaihingen and Gaofen5 (GID5) datasets, respectively, while maintaining 97 fps for 512 × 512 inputs on a single NVIDIA RTX 4090 GPU, outperforming other SOTA methods.
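
The abstract reports accuracy as mIoU (mean intersection over union). It does not show the authors' evaluation code, so the following is only a minimal sketch of the commonly used confusion-matrix definition of the metric. The function names `confusion_matrix` and `mean_iou` and the toy labels are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predictions."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Classes that never occur in either the labels or the predictions
    are skipped (an assumption; evaluation protocols differ on this).
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six flattened pixel labels over three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
score = mean_iou(cm)
```

For the toy labels, the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. In a real evaluation, the confusion matrix would be accumulated over all test images before the per-class ratios are taken.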
