MFPI-Net: A Multi-Scale Feature Perception and Interaction Network for Semantic Segmentation of Urban Remote Sensing Images

Authors

Song Xiaofei, Chen Mingju, Rao Jie, Luo Yangming, Lin Zhihao, Zhang Xingyue, Li Senyuan, Hu Xiao

Affiliations

School of Automation and Information Engineering, Sichuan University of Science and Engineering, Yibin 644005, China.

Intelligent Perception and Control Key Laboratory of Sichuan Province, Sichuan University of Science and Engineering, Yibin 644005, China.

Publication information

Sensors (Basel). 2025 Jul 27;25(15):4660. doi: 10.3390/s25154660.

DOI: 10.3390/s25154660
PMID: 40807825
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12349348/
Abstract

To improve semantic segmentation of complex urban remote sensing images, which are affected by multi-scale object distribution, inter-class similarity, and small-object omission, this paper proposes MFPI-Net, an encoder-decoder semantic segmentation network. It includes four core modules: a Swin Transformer backbone encoder, a diverse dilation rates attention shuffle decoder (DDRASD), a multi-scale convolutional feature enhancement module (MCFEM), and a cross-path residual fusion module (CPRFM). The Swin Transformer efficiently extracts multi-level global semantic features through its hierarchical structure and window attention mechanism. The DDRASD's diverse dilation rates attention (DDRA) block combines convolutions with diverse dilation rates and channel-coordinate attention to enhance multi-scale contextual awareness, while the Shuffle Block raises resolution via pixel rearrangement and avoids checkerboard artifacts. The MCFEM enhances local feature modeling through parallel multi-kernel convolutions, complementing the Swin Transformer's global perception capability. The CPRFM employs multi-branch convolutions and a residual multiplication-addition fusion mechanism to enhance interactions among multi-source features, thereby improving the recognition of small objects and similar categories. Experiments on the ISPRS Vaihingen and Potsdam datasets show that MFPI-Net outperforms mainstream methods, achieving 82.57% and 88.49% mIoU, respectively, which validates its segmentation performance in urban remote sensing.
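
The abstract describes the Shuffle Block only as raising resolution "via pixel rearrangement". As a rough illustration, the sketch below assumes the standard sub-pixel (pixel shuffle) layout, in which a tensor of shape (C·r², H, W) is rearranged into (C, H·r, W·r). The function name `pixel_shuffle` and the nested-list representation are hypothetical choices for this example and are not taken from the paper.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Assumed layout (the common sub-pixel convention):
    out[c][y*r + i][x*r + j] = in[c*r*r + i*r + j][y][x]
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        # Each output pixel is written exactly once.
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


# Four 1x1 channels become one 2x2 channel.
low_res = [[[1]], [[2]], [[3]], [[4]]]
print(pixel_shuffle(low_res, 2))  # [[[1, 2], [3, 4]]]
```

Because every output location receives exactly one input value, the rearrangement itself has no overlapping contributions of the kind a strided transposed convolution produces, which is the usual explanation for why this style of upsampling is less prone to checkerboard patterns.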

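
The reported 82.57% and 88.49% figures are mean intersection-over-union (mIoU) scores. The abstract does not state the evaluation protocol (for example, which classes are counted), so the following is only a minimal sketch of the standard definition, IoU = TP / (TP + FP + FN) averaged over classes. The function name `mean_iou` and the flat label lists are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels.
    """
    inter = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            inter[p] += 1
    ious = []
    for k in range(num_classes):
        # union = TP + FP + FN for class k
        union = pred_count[k] + target_count[k] - inter[k]
        if union > 0:  # skip classes absent from both maps
            ious.append(inter[k] / union)
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
print(round(mean_iou(pred, target, 2), 4))  # 0.5833
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, giving a mean of about 0.5833. Skipping classes that appear in neither map is one common convention; other evaluations average over a fixed class list instead.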
