DSC-Net: Enhancing Blind Road Semantic Segmentation with Visual Sensor Using a Dual-Branch Swin-CNN Architecture.

Authors

Yuan Ying, Du Yu, Ma Yan, Lv Hejun

Affiliation

Beijing Key Laboratory of Information Service Engineering, College of Robotics, Beijing Union University, Beijing 100101, China.

Publication

Sensors (Basel). 2024 Sep 20;24(18):6075. doi: 10.3390/s24186075.

DOI: 10.3390/s24186075
PMID: 39338820
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11435784/
Abstract

In modern urban environments, visual sensors are crucial for enhancing the functionality of navigation systems, particularly for devices designed for visually impaired individuals. The high-resolution images captured by these sensors form the basis for understanding the surrounding environment and identifying key landmarks. However, the core challenge in the semantic segmentation of blind roads lies in the effective extraction of global context and edge features. Most existing methods rely on Convolutional Neural Networks (CNNs), whose inherent inductive biases limit their ability to capture global context and accurately detect discontinuous features such as gaps and obstructions in blind roads. To overcome these limitations, we introduce Dual-Branch Swin-CNN Net (DSC-Net), a new method that integrates the global modeling capabilities of the Swin-Transformer with the CNN-based U-Net architecture. This combination allows for the hierarchical extraction of both fine and coarse features. First, the Spatial Blending Module (SBM) mitigates blurring of target information caused by object occlusion to enhance accuracy. The hybrid attention module (HAM), embedded within the Inverted Residual Module (IRM), sharpens the detection of blind road boundaries, while the IRM improves the speed of network processing. In tests on a specialized dataset designed for blind road semantic segmentation in real-world scenarios, our method achieved an impressive mIoU of 97.72%. Additionally, it demonstrated exceptional performance on other public datasets.
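The headline result above is a mean Intersection-over-Union (mIoU) of 97.72%. For readers unfamiliar with the metric, the sketch below shows how mIoU is computed from predicted and ground-truth label maps; it is an illustrative NumPy implementation, not the authors' evaluation code, and the toy 4x4 example is invented for demonstration.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union averaged over classes.

    pred, target: integer label maps of identical shape.
    Classes absent from both maps are skipped rather than counted as 0.
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        tgt_c = target == c
        inter = np.logical_and(pred_c, tgt_c).sum()
        union = np.logical_or(pred_c, tgt_c).sum()
        if union == 0:  # class appears in neither map: skip
            continue
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy example: a 4x4 map with a 2x2 "blind road" region (class 1);
# the prediction misses one pixel of it.
target = np.zeros((4, 4), dtype=int)
target[1:3, 1:3] = 1
pred = target.copy()
pred[2, 2] = 0  # one false negative
print(round(mean_iou(pred, target, num_classes=2), 4))  # 0.8365
```

Per-class IoU here is 12/13 for background and 3/4 for the road class, averaging to about 0.8365; real benchmarks apply the same formula per image or over an accumulated confusion matrix.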

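The dual-branch design described in the abstract — a transformer branch for global context fused with a CNN branch for local edges — can be caricatured in a few lines of NumPy. The functions below are hypothetical stand-ins invented for illustration (a global mean in place of Swin attention, a Laplacian-style filter in place of learned convolutions, a fixed weighted sum in place of the paper's fusion modules), not DSC-Net itself.

```python
import numpy as np

def global_branch(x):
    """Stand-in for the Swin-Transformer branch: summarizes global
    context by broadcasting the image-wide mean to every pixel."""
    return np.full_like(x, x.mean())

def local_branch(x):
    """Stand-in for the CNN branch: a crude 3x3 Laplacian-style
    filter that responds to local edges (e.g. road boundaries)."""
    out = np.zeros_like(x)
    out[1:-1, 1:-1] = (4 * x[1:-1, 1:-1]
                       - x[:-2, 1:-1] - x[2:, 1:-1]
                       - x[1:-1, :-2] - x[1:-1, 2:])
    return out

def fuse(x, w_global=0.5, w_local=0.5):
    """Dual-branch fusion: weighted combination of global context
    and local edge responses, preserving the spatial shape."""
    return w_global * global_branch(x) + w_local * local_branch(x)

x = np.arange(16, dtype=float).reshape(4, 4)
fused = fuse(x)
print(fused.shape)  # (4, 4)
```

The point of the sketch is structural: both branches see the same input, each extracts a different kind of feature, and fusion happens per pixel so the output keeps the input resolution — the property a segmentation head needs.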

Figures (PMC11435784):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/67549bed01c0/sensors-24-06075-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/153e7a48775f/sensors-24-06075-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/4e6c992984b7/sensors-24-06075-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/6689ebec7d66/sensors-24-06075-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/c4fd7bc7dca2/sensors-24-06075-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/45b3e1e05d52/sensors-24-06075-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/f8e2dfbbaa90/sensors-24-06075-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/fb0ff14f23a1/sensors-24-06075-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/97f9a5ee50d8/sensors-24-06075-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/4a185e60767d/sensors-24-06075-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d84/11435784/16b94a3177a5/sensors-24-06075-g011.jpg

Similar Articles

1
DSC-Net: Enhancing Blind Road Semantic Segmentation with Visual Sensor Using a Dual-Branch Swin-CNN Architecture.
Sensors (Basel). 2024 Sep 20;24(18):6075. doi: 10.3390/s24186075.
2
CPFTransformer: transformer fusion context pyramid medical image segmentation network.
Front Neurosci. 2023 Dec 7;17:1288366. doi: 10.3389/fnins.2023.1288366. eCollection 2023.
3
Swin-Net: A Swin-Transformer-Based Network Combing with Multi-Scale Features for Segmentation of Breast Tumor Ultrasound Images.
Diagnostics (Basel). 2024 Jan 26;14(3):269. doi: 10.3390/diagnostics14030269.
4
ETU-Net: edge enhancement-guided U-Net with transformer for skin lesion segmentation.
Phys Med Biol. 2023 Dec 22;69(1). doi: 10.1088/1361-6560/ad13d2.
5
SwinCross: Cross-modal Swin transformer for head-and-neck tumor segmentation in PET/CT images.
Med Phys. 2024 Mar;51(3):2096-2107. doi: 10.1002/mp.16703. Epub 2023 Sep 30.
6
Multi-task approach based on combined CNN-transformer for efficient segmentation and classification of breast tumors in ultrasound images.
Vis Comput Ind Biomed Art. 2024 Jan 26;7(1):2. doi: 10.1186/s42492-024-00155-w.
7
TGDAUNet: Transformer and GCNN based dual-branch attention UNet for medical image segmentation.
Comput Biol Med. 2023 Dec;167:107583. doi: 10.1016/j.compbiomed.2023.107583. Epub 2023 Oct 21.
8
Transformer-Based Model with Dynamic Attention Pyramid Head for Semantic Segmentation of VHR Remote Sensing Imagery.
Entropy (Basel). 2022 Nov 6;24(11):1619. doi: 10.3390/e24111619.
9
A dual-branch and dual attention transformer and CNN hybrid network for ultrasound image segmentation.
Front Physiol. 2024 Sep 27;15:1432987. doi: 10.3389/fphys.2024.1432987. eCollection 2024.
10
iU-Net: a hybrid structured network with a novel feature fusion approach for medical image segmentation.
BioData Min. 2023 Feb 21;16(1):5. doi: 10.1186/s13040-023-00320-6.

Cited By

1
Low-Quality Sensor Data-Based Semi-Supervised Learning for Medical Image Segmentation.
Sensors (Basel). 2024 Dec 5;24(23):7799. doi: 10.3390/s24237799.

References

1
Centralized Feature Pyramid for Object Detection.
IEEE Trans Image Process. 2023;32:4341-4354. doi: 10.1109/TIP.2023.3297408. Epub 2023 Aug 2.
2
Contextual Transformer Networks for Visual Recognition.
IEEE Trans Pattern Anal Mach Intell. 2023 Feb;45(2):1489-1500. doi: 10.1109/TPAMI.2022.3164083. Epub 2023 Jan 6.
3
An Improved DeepLab v3+ Deep Learning Network Applied to the Segmentation of Grape Leaf Black Rot Spots.
Front Plant Sci. 2022 Feb 15;13:795410. doi: 10.3389/fpls.2022.795410. eCollection 2022.
4
Hybrid Deep Learning-Gaussian Process Network for Pedestrian Lane Detection in Unstructured Scenes.
IEEE Trans Neural Netw Learn Syst. 2020 Dec;31(12):5324-5338. doi: 10.1109/TNNLS.2020.2966246. Epub 2020 Nov 30.
5
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs.
IEEE Trans Pattern Anal Mach Intell. 2018 Apr;40(4):834-848. doi: 10.1109/TPAMI.2017.2699184. Epub 2017 Apr 27.
6
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation.
IEEE Trans Pattern Anal Mach Intell. 2017 Dec;39(12):2481-2495. doi: 10.1109/TPAMI.2016.2644615. Epub 2017 Jan 2.
7
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition.
IEEE Trans Pattern Anal Mach Intell. 2015 Sep;37(9):1904-16. doi: 10.1109/TPAMI.2015.2389824.
8
Segmentation of pulmonary nodules in thoracic CT scans: a region growing approach.
IEEE Trans Med Imaging. 2008 Apr;27(4):467-80. doi: 10.1109/TMI.2007.907555.