Gupta Siddharth, Dubey Arun K, Singh Rajesh, Kalra Mannudeep K, Abraham Ajith, Kumari Vandana, Laird John R, Al-Maini Mustafa, Gupta Neha, Singh Inder, Viskovic Klaudija, Saba Luca, Suri Jasjit S
Department of Computer Science and Engineering, Bharati Vidyapeeth's College of Engineering, New Delhi 110063, India.
Department of Information Technology, Bharati Vidyapeeth's College of Engineering, New Delhi 110063, India.
Diagnostics (Basel). 2024 Jul 16;14(14):1534. doi: 10.3390/diagnostics14141534.
Background: Diagnosing lung diseases accurately is crucial for proper treatment. Convolutional neural networks (CNNs) have advanced medical image processing, but challenges remain in their explainability and reliability. This study combines U-Net with attention mechanisms and Vision Transformers (ViTs) to enhance lung disease segmentation and classification. We hypothesize that Attention U-Net will improve segmentation accuracy and that ViTs will improve classification performance. The explainability methodologies will shed light on model decision-making processes, aiding clinical acceptance. Methods: A comparative approach was used to evaluate deep learning models for segmenting and classifying lung diseases from chest X-rays. The Attention U-Net model was used for segmentation, and architectures comprising four CNNs and four ViTs were investigated for classification. Methods such as Gradient-weighted Class Activation Mapping++ (Grad-CAM++) and Layer-wise Relevance Propagation (LRP) provide explainability by identifying the regions that most influence model decisions. Results: The results support the conclusion that ViTs excel at identifying lung disorders. Attention U-Net achieved a Dice Coefficient of 98.54% and a Jaccard Index of 97.12%. ViTs outperformed CNNs in classification tasks by 9.26%, reaching an accuracy of 98.52% with MobileViT. An 8.3% increase in accuracy was observed when moving from raw-image classification to segmented-image classification. Techniques such as Grad-CAM++ and LRP provided insights into the models' decision-making processes. Conclusions: This study highlights the benefits of integrating Attention U-Net and ViTs for analyzing lung diseases, demonstrating their importance in clinical settings. Emphasizing explainability clarifies deep learning processes, strengthening confidence in AI solutions and potentially improving clinical acceptance for better healthcare outcomes.
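The segmentation results above are reported as a Dice Coefficient and a Jaccard Index. As a minimal sketch (function names and the toy masks are illustrative, not from the paper), these overlap metrics can be computed over flattened binary masks as follows; note the two are interchangeable via Jaccard = Dice / (2 − Dice):

```python
def dice_coefficient(pred, target):
    """Dice = 2|A∩B| / (|A| + |B|) over binary masks given as flat lists of 0/1."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 2.0 * intersection / total if total else 1.0  # both masks empty -> perfect overlap

def jaccard_index(pred, target):
    """Jaccard = |A∩B| / |A∪B|; equivalently Dice / (2 - Dice)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    union = sum(pred) + sum(target) - intersection
    return intersection / union if union else 1.0

# Toy 2x2 masks, flattened: a hypothetical predicted vs. ground-truth segmentation.
pred   = [1, 0, 1, 1]
target = [1, 1, 1, 0]
print(dice_coefficient(pred, target))  # 2*2 / (3+3) = 0.666...
print(jaccard_index(pred, target))     # 2 / 4 = 0.5
```

In practice these would be evaluated on full-resolution mask tensors (e.g., with a smoothing term in the denominator when used as a training loss), but the arithmetic matches the metrics quoted in the Results.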