UAV Imagery Real-Time Semantic Segmentation with Global-Local Information Attention

Authors

Zhang Zikang, Li Gongquan

Affiliation

School of Geosciences, Yangtze University, Wuhan 430100, China.

Publication information

Sensors (Basel). 2025 Mar 13;25(6):1786. doi: 10.3390/s25061786.

DOI: 10.3390/s25061786
PMID: 40292877
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11946199/

Abstract

Current lightweight algorithms for real-time semantic segmentation of drone imagery integrate global and local image information poorly, which leads to missed detections and misclassified categories. This paper proposes a real-time semantic segmentation method for drone imagery that integrates multi-scale global context information. The method uses a UNet structure whose encoder is a ResNet18 network for feature extraction. The decoder incorporates a global-local attention module: the global branch compresses and extracts global information along the vertical and horizontal directions, and the local branch extracts local information through convolution, which strengthens the fusion of global and local information in the image. In the segmentation head, a shallow-feature fusion module integrates the features extracted by the encoder at multiple scales, strengthening the spatial information in the shallow features. The model was tested on the UAVid and UDD6 datasets, achieving 68% and 67% mIoU (mean Intersection over Union), respectively, which is 10% and 21.2% higher than the baseline model UNet. The model runs at 72.4 frames/s, which is 54.4 frames/s faster than the baseline UNet. The experimental results demonstrate that the proposed model balances accuracy and real-time performance well.
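
The accuracy figures in the abstract are reported as mIoU. As a minimal sketch of how that metric is commonly computed, the following pure-Python example builds a confusion matrix from flattened label lists and averages the per-class IoU. The function names `confusion_matrix` and `mean_iou`, the toy labels, and the choice to skip classes that appear in neither the ground truth nor the prediction are illustrative assumptions; the paper's own evaluation code is not shown in this record and may differ.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the ground-truth class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    cm = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in cm) - tp   # predicted c, true class otherwise
        union = tp + fp + fn
        if union == 0:
            # Assumption: a class absent from both inputs is left out of the mean.
            continue
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six "pixels" (hypothetical data).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(truth, pred, 3), 3))  # per-class IoU: 1/3, 2/3, 1/2
```

For a real segmentation map, the label images would first be flattened to one-dimensional sequences of class indices before being passed to `mean_iou`.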



