A Large Kernel Convolutional Neural Network with a Noise Transfer Mechanism for Real-Time Semantic Segmentation.

Authors

Liu Jinhang, Du Yuhe, Wang Jing, Tang Xing

Affiliations

School of Computer Science, Hubei University of Technology, Wuhan 430070, China.

Key Laboratory of Green Intelligent Computing Network in Hubei Province, Wuhan 430068, China.

Publication information

Sensors (Basel). 2025 Aug 29;25(17):5357. doi: 10.3390/s25175357.

DOI:10.3390/s25175357

PMID:40942785

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12431489/

Abstract

In semantic segmentation tasks, large kernels and Atrous convolution have been utilized to increase the receptive field, enabling models to achieve competitive performance with fewer parameters. However, due to the fixed size of kernel functions, networks incorporating large convolutional kernels are limited in adaptively capturing multi-scale features and fail to effectively leverage global contextual information. To address this issue, we combine Atrous convolution with large kernel convolution, using different dilation rates to compensate for the single-scale receptive field limitation of large kernels. Simultaneously, we employ a dynamic selection mechanism to adaptively highlight the most important spatial features based on global information. Additionally, to enhance the model's ability to fit the true label distribution, we propose a Multi-Scale Contextual Noise Transfer Matrix (NTM), which uses high-order consistency information from neighborhood representations to estimate NTM and correct supervision signals, thereby improving the model's generalization capability. Extensive experiments conducted on Cityscapes, ADE20K, and COCO-Stuff-10K demonstrate that this approach achieves a new state-of-the-art balance between speed and accuracy. Specifically, LKNTNet achieves 80.05% mIoU on Cityscapes with an inference speed of 80.7 FPS and 42.7% mIoU on ADE20K with an inference speed of 143.6 FPS.
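Two ideas in the abstract can be illustrated with simple arithmetic. The sketch below is not the authors' implementation: the 7x7 kernel, the dilation rates, and the 2-class transition matrix are hypothetical values chosen for illustration. The first function shows how dilation widens the spatial extent of a fixed-size kernel, using the standard formula k + (k - 1)(d - 1). The second shows a generic forward correction with a noise transition matrix, in which predicted clean-class probabilities are mapped to the distribution expected under noisy labels. How LKNTNet estimates its multi-scale contextual NTM is not specified here.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a k x k kernel at a given dilation rate."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def forward_correct(clean_probs, transition):
    """Map clean-class probabilities to noisy-label probabilities.

    transition[i][j] is the assumed probability that true class i is
    annotated as class j, so each row sums to 1.
    """
    num_classes = len(clean_probs)
    return [
        sum(clean_probs[i] * transition[i][j] for i in range(num_classes))
        for j in range(num_classes)
    ]


# Hypothetical 7x7 kernel at three hypothetical dilation rates.
extents = {d: effective_kernel_size(7, d) for d in (1, 2, 3)}
print(extents)

# Hypothetical 2-class transition matrix and per-pixel prediction.
T = [[0.9, 0.1],
     [0.2, 0.8]]
noisy = forward_correct([0.7, 0.3], T)
print(noisy)
```

Because each row of the transition matrix sums to 1, the corrected output remains a valid probability distribution, which lets it be compared against the observed (possibly noisy) labels in a standard cross-entropy loss.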

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/856c/12431489/841e2c86857a/sensors-25-05357-g001.jpg

Similar articles

A novel image segmentation network with multi-scale and flow-guided attention for early screening of vaginal intraepithelial neoplasia (VAIN).

Med Phys. 2025 Aug;52(8):e18041. doi: 10.1002/mp.18041.

Prescription of Controlled Substances: Benefits and Risks

EPSegNet: Lightweight Semantic Recalibration and Assembly for Efficient Polyp Segmentation.

IEEE Trans Neural Netw Learn Syst. 2025 Aug;36(8):13805-13817. doi: 10.1109/TNNLS.2025.3527557.

DCE-UNet: A Transformer-Based Fully Automated Segmentation Network for Multiple Adolescent Spinal Disorders in X-ray Images.

Biomed Phys Eng Express. 2025 Aug 21. doi: 10.1088/2057-1976/adfde9.

Semantic consistency-guided patch-wise relation graph reasoning scheme for lung cancer organoid segmentation in brightfield microscopy.

Comput Methods Programs Biomed. 2025 Nov;271:108964. doi: 10.1016/j.cmpb.2025.108964. Epub 2025 Jul 23.

DGCFNet: Dual Global Context Fusion Network for remote sensing image semantic segmentation.

PeerJ Comput Sci. 2025 Mar 27;11:e2786. doi: 10.7717/peerj-cs.2786. eCollection 2025.

LKDA-Net: Hierarchical transformer with large Kernel depthwise convolution attention for 3D medical image segmentation.

PLoS One. 2025 Aug 8;20(8):e0329806. doi: 10.1371/journal.pone.0329806. eCollection 2025.

Short-Term Memory Impairment

Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations.

IEEE Trans Pattern Anal Mach Intell. 2025 Sep 10;PP. doi: 10.1109/TPAMI.2025.3600461.
