Vision transformer-based diagnosis of lumbar disc herniation with grad-CAM interpretability in CT imaging.

Authors

Chu Qingsong, Wang Xingyu, Lv Hao, Zhou Yao, Jiang Ting

Affiliations

The First Affiliated Hospital of Anhui University of Chinese Medicine, Hefei, China.

Anhui University of Chinese Medicine, Hefei, China.

Publication

BMC Musculoskelet Disord. 2025 Apr 29;26(1):419. doi: 10.1186/s12891-025-08602-2.

Abstract

BACKGROUND

This study proposes a computed tomography (CT)-vision transformer (ViT) framework for diagnosing lumbar disc herniation (LDH). The authors present it as the first such framework, designed to combine the multidirectional imaging capability of CT with the strengths of a ViT.

METHODS

The proposed ViT model was trained and validated on a dataset of 2100 CT images from 983 patients. Its performance was compared with that of several convolutional neural networks (CNNs), namely ResNet18, ResNet50, LeNet, AlexNet, and VGG16, on two primary tasks: vertebra localization and disc abnormality classification.

RESULTS

Combining a ViT with CT imaging allowed the model to capture complex spatial relationships and global dependencies within the scans. It outperformed the CNN models, achieving accuracies of 97.13% for vertebra localization and 93.63% for disc abnormality classification. Gradient-weighted class activation mapping (Grad-CAM) was then used to examine the model's predictions, highlighting the regions of the CT scans that contributed to them.
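
To make the Grad-CAM step concrete, the following is a minimal sketch of the standard Grad-CAM computation in plain Python. It is not the authors' implementation: the abstract does not state which layer, library, or token-to-grid reshaping they used. The function name `grad_cam` and the toy numbers are hypothetical, and the inputs are assumed to be feature maps (K channels on an H x W grid) together with the gradients of the predicted class score with respect to those maps.

```python
def grad_cam(activations, gradients):
    """Standard Grad-CAM on nested lists (illustrative sketch only).

    activations, gradients: K channels, each an H x W nested list.
    gradients holds d(class score) / d(activation) at each position.
    """
    n_channels = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])

    # Channel weight: spatial mean of the class-score gradient.
    weights = [
        sum(sum(row) for row in gradients[k]) / (height * width)
        for k in range(n_channels)
    ]

    # Weighted sum over channels, then ReLU to keep positive evidence only.
    cam = [
        [
            max(0.0, sum(weights[k] * activations[k][i][j]
                         for k in range(n_channels)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Scale to [0, 1] for overlaying; an all-zero map is returned unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


# Toy input with made-up numbers: 2 channels on a 2 x 2 grid.
toy_activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
toy_gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # positive weight: supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],  # negative weight: suppressed by ReLU
]
heatmap = grad_cam(toy_activations, toy_gradients)
```

In practice the resulting map would be upsampled to the CT slice size and overlaid on the image. For a ViT, the patch-token activations would first have to be reshaped into a 2D grid, which is an assumption here rather than something the abstract describes.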

CONCLUSION

This study demonstrated the potential of a ViT for diagnosing LDH using CT imaging. The results highlight the promising clinical applications of this approach, particularly for enhancing the diagnostic efficiency and transparency of medical AI systems.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6a00/12039304/90233952e9d3/12891_2025_8602_Fig1_HTML.jpg