RTC_TongueNet: An improved tongue image segmentation model based on DeepLabV3.

Author information

Tang Yan, Tan Daiqing, Li Huixia, Zhu Muhua, Li Xiaohui, Wang Xuan, Wang JiaQi, Wang Zaijian, Gao Chenxi, Wang Ji, Han Aiqing

Affiliations

Beijing University of Chinese Medicine, Beijing, China.

The Third Affiliated Hospital of Beijing University of Chinese Medicine, Beijing, China.

Publication information

Digit Health. 2024 Mar 28;10:20552076241242773. doi: 10.1177/20552076241242773. eCollection 2024 Jan-Dec.

Abstract

OBJECTIVE

Tongue segmentation is the basis for automated tongue recognition studies in Chinese medicine, but existing approaches suffer from defects such as network degradation and an inability to capture global features, which seriously degrade segmentation quality. This article proposes RTC_TongueNet, an improved model based on DeepLabV3 that combines an improved residual structure with a transformer and integrates the ECA (Efficient Channel Attention) mechanism with multiscale atrous convolution to improve tongue image segmentation.

METHODS

We improve the DeepLabV3 backbone by incorporating a transformer structure and an improved residual structure. The residual module comes in two variants, selected under different conditions, so that shallow information is mapped to the deeper network more frequently and the low-level features of the tongue image are extracted more effectively. An ECA attention mechanism is introduced after the concat operation in the ASPP (Atrous Spatial Pyramid Pooling) structure to strengthen information interaction and fusion, extract both local and global features, and help the model focus on hard-to-separate areas such as the tongue edge, yielding better segmentation.
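To make the ECA step concrete, here is a minimal NumPy sketch of channel attention applied to a concatenated feature map, as would follow the ASPP concat. It is an illustration under stated assumptions, not the authors' implementation: the function names `eca_kernel_size` and `eca_attention` are hypothetical, the adaptive kernel-size rule (with gamma=2, b=1) is the one from the original ECA-Net paper and the abstract does not say whether RTC_TongueNet uses it, and the 1D convolution weights are fixed placeholders where a trained model would use learned values.

```python
import math
import numpy as np


def eca_kernel_size(channels, gamma=2, b=1):
    """Odd 1D kernel size chosen from the channel count (ECA-Net rule, assumed here)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca_attention(features, weights):
    """Apply ECA-style channel attention to a (C, H, W) feature map.

    weights is a 1D kernel of odd length k; it is a placeholder for the
    learned convolution weights of a real model.
    """
    channels = features.shape[0]
    k = weights.shape[0]
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = features.mean(axis=(1, 2))
    # Local cross-channel interaction: 1D convolution over the channel axis
    # with zero padding, so each channel sees its k nearest neighbours.
    padded = np.pad(pooled, k // 2)
    scores = np.array([np.dot(padded[i:i + k], weights) for i in range(channels)])
    # Sigmoid gate in (0, 1), then rescale every channel of the input.
    gate = 1.0 / (1.0 + np.exp(-scores))
    return features * gate[:, None, None], gate


# Hypothetical stand-in for the ASPP concat: five branches of 4 channels each.
rng = np.random.default_rng(0)
branches = [rng.standard_normal((4, 8, 8)) for _ in range(5)]
concat = np.concatenate(branches, axis=0)          # shape (20, 8, 8)
k = eca_kernel_size(concat.shape[0])
reweighted, gate = eca_attention(concat, np.full(k, 1.0 / k))
print(reweighted.shape, gate.round(3))
```

The gate has one value per channel, so the module adds only `k` parameters while letting the network emphasise the branches that carry useful information at a given scale.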

RESULTS

RTC_TongueNet was compared with the FCN (Fully Convolutional Networks), UNet, LRASPP (Lite Reduced ASPP), and DeepLabV3 models on two datasets. On both datasets, the classic DeepLabV3 model achieved higher MIOU (Mean Intersection over Union) and MPA (Mean Pixel Accuracy) values than FCN, UNet, and LRASPP. Compared with DeepLabV3, RTC_TongueNet increased MIOU by 0.9% and MPA by 0.3% on the first dataset, and increased MIOU by 1.0% and MPA by 1.1% on the second dataset. RTC_TongueNet performed best on both datasets.
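For readers unfamiliar with the two metrics, the sketch below computes MIOU and MPA from a confusion matrix for a two-class (background versus tongue) mask. The abstract does not spell out its formulas, so the standard definitions are assumed here: per-class IoU is TP / (TP + FP + FN), per-class pixel accuracy is TP / (TP + FN), and each metric is the mean over classes. The function names and the toy masks are hypothetical.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = target.ravel() * num_classes + pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def miou_and_mpa(pred, target, num_classes=2):
    """Return (MIOU, MPA) under the standard definitions assumed above."""
    cm = confusion_matrix(pred, target, num_classes).astype(float)
    tp = np.diag(cm)
    gt_pixels = cm.sum(axis=1)      # TP + FN per class
    pred_pixels = cm.sum(axis=0)    # TP + FP per class
    union = gt_pixels + pred_pixels - tp
    # Classes absent from both masks yield NaN and are skipped by nanmean.
    iou = np.divide(tp, union, out=np.full_like(tp, np.nan), where=union > 0)
    acc = np.divide(tp, gt_pixels, out=np.full_like(tp, np.nan), where=gt_pixels > 0)
    return float(np.nanmean(iou)), float(np.nanmean(acc))


# Toy masks: 0 = background, 1 = tongue. One tongue pixel is missed.
target = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1]])
pred = np.array([[0, 0, 0, 1],
                 [0, 1, 1, 1]])
miou, mpa = miou_and_mpa(pred, target)
print(f"MIOU={miou:.3f}, MPA={mpa:.3f}")
```

Because both metrics average over classes, errors on the smaller class weigh as much as errors on the larger one, which is why a handful of misclassified edge pixels can move them noticeably.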

CONCLUSION

Building on DeepLabV3, this study uses the improved residual structure and a transformer as the backbone to extract image features both locally and globally. The ECA attention module is added to enhance channel attention, strengthening useful information and weakening the interference of useless information. The RTC_TongueNet model can effectively segment tongue images, and the study offers practical and reference value for tongue image segmentation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a643/10976494/472ee00b65a5/10.1177_20552076241242773-fig1.jpg
