

Partial Attention in Global Context and Local Interaction for Addressing Noisy Labels and Weighted Redundancies on Medical Images.

Authors

Nguyen Minh Tai Pham, Phan Tran Minh Khue, Nakano Tadashi, Tran Thi Hong, Nguyen Quoc Duy Nam

Affiliations

Faculty of Advanced Program, Ho Chi Minh City Open University, Ho Chi Minh City 700000, Vietnam.

Faculty of Information Technology, Ho Chi Minh City Open University, Ho Chi Minh City 700000, Vietnam.

Publication

Sensors (Basel). 2024 Dec 30;25(1):163. doi: 10.3390/s25010163.

Abstract

Recently, the application of deep neural networks to detect anomalies in medical images has been hindered by noisy labels, including overlapping objects and similar classes. This study addresses that challenge by proposing a unique attention module that helps deep neural networks focus on important object features under noisy medical image conditions. The module integrates global context modeling, which creates long-range dependencies, with local interaction, which enables channel attention through 1D convolution; it not only performs well with noisy labels but also consumes significantly fewer resources, without any dimensionality reduction. The module is named Global Context and Local Interaction (GCLI). We further experimented with and propose a partial attention strategy for the GCLI module, aiming to efficiently reduce weighted redundancies. This strategy uses a subset of channels, rather than every channel, to produce the attention weights; as a result, it greatly reduces the risk of introducing weighted redundancies caused by modeling global context. For classification, the proposed method helps ResNet34 achieve up to 82.5% accuracy on the Chaoyang test set, the highest figure among the compared SOTA attention modules, without using any processing filter to reduce the effect of noisy labels. For object detection, GCLI boosts YOLOv8 to 52.1% mAP50 on the GRAZPEDWRI-DX test set, the highest performance among the compared attention modules, and ranks second in mAP50 on the VinDR-CXR test set. In terms of model complexity, the GCLI module requires up to 225 times fewer extra parameters and runs inference more than 30% faster than the other attention modules.
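The abstract's mechanism — a per-channel global context descriptor, a 1D convolution across neighboring channels to produce attention weights without dimensionality reduction, and a partial attention strategy that re-weights only a subset of channels — can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `gcli_attention`, the use of global average pooling as a stand-in for the paper's context modeling, the fixed convolution kernel, and the choice to leave unselected channels unscaled are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv1d_same(v, kernel):
    # 1D convolution over the channel axis with zero padding,
    # so the output has the same length as the input
    k = len(kernel)
    pad = k // 2
    vp = np.pad(v, pad)
    return np.array([np.dot(vp[i:i + k], kernel) for i in range(len(v))])

def gcli_attention(x, kernel, ratio=0.5):
    """Hypothetical GCLI-style channel attention.

    x      : feature map of shape (C, H, W)
    kernel : 1D kernel for local cross-channel interaction
    ratio  : fraction of channels that receive attention weights
             (the 'partial attention' subset; assumed here to be
             the leading channels, with the rest passed through)
    """
    C, H, W = x.shape
    # 1) Global context: one spatial descriptor per channel
    #    (plain global average pooling used as a simple stand-in
    #    for the paper's long-range context modeling)
    ctx = x.reshape(C, -1).mean(axis=1)            # (C,)
    # 2) Partial attention: only ratio * C channels are re-weighted
    n = int(C * ratio)
    # 3) Local interaction: 1D conv across neighboring channels,
    #    with no dimensionality reduction, then a sigmoid gate
    w = sigmoid(conv1d_same(ctx[:n], kernel))      # (n,), values in (0, 1)
    out = x.copy()
    out[:n] *= w[:, None, None]                    # broadcast over H, W
    return out
```

Because the 1D kernel spans only a few channels and the partial strategy computes weights for a fraction of them, the extra parameter and compute cost stays small — consistent with the resource savings the abstract claims, though the exact design in the paper may differ.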


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f215/11722591/c7bfa53e9e38/sensors-25-00163-g001.jpg
