

CVT-HNet: a fusion model for recognizing perianal fistulizing Crohn's disease based on CNN and ViT.

Author Information

Li Lanlan, Wang Ziyue, Wang Chongyang, Chen Tao, Deng Ke, Wei Hong'an, Wang Dabiao, Li Juan, Zhang Heng

Affiliations

College of Physics and Information Engineering, Fuzhou University, Fuzhou, 350108, China.

Fujian Key Laboratory for Intelligent Processing and Wireless Transmission of Media Information, Fuzhou University, Fuzhou, 350108, China.

Publication Information

BMC Med Imaging. 2025 Jul 28;25(1):298. doi: 10.1186/s12880-025-01833-8.

Abstract

BACKGROUND

Accurate identification of anal fistulas is essential, as it directly affects the severity of subsequent perianal infections, prognostic indicators, and overall treatment outcomes. Traditional manual recognition methods are inefficient, so computer vision methods have been adopted to improve efficiency. Convolutional neural networks (CNNs) form the main basis for detecting anal fistulas in current computer vision techniques. However, CNNs often struggle to capture long-range dependencies effectively, which limits how well they handle anal fistula images.

METHODS

This study proposes a new fusion model, CVT-HNet, that integrates MobileNet with a Vision Transformer (ViT). The design uses CNNs to extract local features and Transformer encoders to capture long-range dependencies. In addition, the MobileNetV2 backbone, augmented with a Coordinate Attention mechanism, and the encoder modules are optimized to improve the precision of anal fistula detection.
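The Coordinate Attention reweighting mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it omits the shared bottleneck transform of the published Coordinate Attention module, and `w_h`/`w_w` are hypothetical per-direction projection matrices standing in for learned 1×1 convolutions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x, w_h, w_w):
    """Simplified coordinate attention over a feature map x of shape (C, H, W).

    w_h, w_w: hypothetical (C, C) projection matrices for the height- and
    width-direction gates (the real module uses a shared bottleneck conv).
    """
    z_h = x.mean(axis=2)           # pool along width  -> (C, H)
    z_w = x.mean(axis=1)           # pool along height -> (C, W)
    a_h = sigmoid(w_h @ z_h)       # height-direction gate, values in (0, 1)
    a_w = sigmoid(w_w @ z_w)       # width-direction gate, values in (0, 1)
    # Broadcast the two direction-aware gates back over the input map,
    # so each position (h, w) is scaled by its row and column importance.
    return x * a_h[:, :, None] * a_w[:, None, :]
```

Unlike plain channel attention (e.g., squeeze-and-excitation), the two pooled vectors preserve positional information along each axis, which is why this mechanism can sharpen spatial localization of the fistula region.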

RESULTS

Comparative experiments show that CVT-HNet achieves an accuracy of 80.66% with strong robustness, surpassing both pure Transformer architectures and other fusion networks. Internal validation demonstrates the reliability and consistency of CVT-HNet, and external validation shows good transferability and generalizability. In visualization analysis, CVT-HNet focuses more tightly on the region of interest in anal fistula images. Finally, the contribution of each CVT-HNet component module is evaluated through ablation experiments.

CONCLUSION

The experimental results highlight the superior performance and practicality of CVT-HNet in detecting anal fistulas. By combining local and global information, the model achieves high accuracy, robustness, and generalizability, making it suitable for real-world applications where data variability is common. These findings underscore its effectiveness in clinical contexts.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/beb5/12305962/cf3a516079f0/12880_2025_1833_Fig1_HTML.jpg
