Radiology Department, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Haining Rd.100, Shanghai, 200080, China.
Department of Epidemiology, University Medical Center Groningen, University of Groningen, Hanzeplein 1, 9713 GZ, Groningen, The Netherlands.
Eur Radiol. 2021 Oct;31(10):7303-7315. doi: 10.1007/s00330-021-07901-1. Epub 2021 Apr 13.
The interpretability of convolutional neural networks (CNNs) for classifying subsolid nodules (SSNs) is insufficient for clinicians. Our purpose was to develop CNN models to classify SSNs on CT images and to investigate image features associated with the CNN classification.
CT images containing SSNs with a diameter of ≤ 3 cm were retrospectively collected. We trained and validated CNNs with 5-fold cross-validation to classify SSNs into three categories (benign and preinvasive lesions [PL], minimally invasive adenocarcinoma [MIA], and invasive adenocarcinoma [IA]) that were histologically confirmed or followed up for 6.4 years. The mechanism by which the CNNs respond to human-recognizable CT image features was investigated and visualized with gradient-weighted class activation mapping (Grad-CAM), separated activation channels and areas, and the DeepDream algorithm.
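As an illustration of the visualization approach, the following is a minimal Grad-CAM sketch in PyTorch. The ResNet-18 backbone, the hooked layer, and the input size are placeholder assumptions for the sketch, not the architecture used in this study.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Placeholder three-class SSN classifier; the study's actual architecture
# is not reproduced here, so a standard ResNet-18 stands in for it.
model = models.resnet18(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 3)  # benign/PL, MIA, IA
model.eval()

feature_maps, gradients = {}, {}
layer = model.layer4  # last convolutional stage, whose maps Grad-CAM weights
layer.register_forward_hook(lambda m, i, o: feature_maps.update(value=o))
layer.register_full_backward_hook(lambda m, gi, go: gradients.update(value=go[0]))

def grad_cam(image, target_class):
    """Heatmap of the regions that drive the score of one class."""
    logits = model(image)                      # shape (1, 3)
    model.zero_grad()
    logits[0, target_class].backward()
    # Global-average-pool the gradients to get per-channel weights.
    weights = gradients["value"].mean(dim=(2, 3), keepdim=True)
    cam = F.relu((weights * feature_maps["value"]).sum(dim=1, keepdim=True))
    # Upsample to the input resolution so the map can overlay the CT slice.
    cam = F.interpolate(cam, size=image.shape[2:], mode="bilinear",
                        align_corners=False)
    return (cam / cam.max().clamp(min=1e-8)).squeeze()

# Dummy input standing in for a preprocessed CT patch around an SSN.
heatmap = grad_cam(torch.randn(1, 3, 224, 224), target_class=2)  # IA class
```

The separated activation channels and areas described in the methods can be inspected along the same lines by visualizing individual maps such as feature_maps["value"][0, c] rather than their weighted sum.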
The accuracy was 93% for classifying 586 SSNs from 569 patients into three categories (346 benign and PL, 144 MIA, and 96 IA) in 5-fold cross-validation. Grad-CAM successfully located the entire region of the image features that determined the final classification. Activated areas in the benign and PL group were primarily smooth margins (p < 0.001) and ground-glass components (p = 0.033), whereas in the IA group the activated areas were mainly part-solid (p < 0.001) and solid components (p < 0.001), lobulated shapes (p < 0.001), and air bronchograms (p < 0.001). The activated areas for MIA, however, were variable. The DeepDream algorithm rendered the image features that the CNN had learned from the training dataset in a human-recognizable pattern.
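For the DeepDream visualization, a minimal sketch under the same placeholder assumptions (ResNet-18 backbone, randomly initialized weights): gradient ascent on the input image maximizes the mean activation of a chosen channel, rendering the pattern that channel has learned to respond to.

```python
import torch
from torchvision import models

model = models.resnet18(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 3)
model.eval()
for p in model.parameters():
    p.requires_grad_(False)  # only the input image is optimized

activations = {}
model.layer4.register_forward_hook(lambda m, i, o: activations.update(value=o))

def deep_dream(channel, steps=200, lr=0.05, size=224):
    """Gradient ascent on the input to maximize one channel's activation."""
    img = torch.randn(1, 3, size, size, requires_grad=True)
    opt = torch.optim.Adam([img], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        model(img)
        loss = -activations["value"][0, channel].mean()  # ascend activation
        loss.backward()
        opt.step()
    return img.detach()

dreamed = deep_dream(channel=0)  # image pattern that channel 0 detects
```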
This study provides medical evidence for interpreting the mechanism of CNNs, which helps support the clinical application of artificial intelligence.
• CNN achieved high accuracy (93%) in classifying subsolid nodules on CT images into three categories: benign and preinvasive lesions, MIA, and IA.
• The gradient-weighted class activation map (Grad-CAM) located the entire region of image features that determined the final classification, and the visualization of the separated activated areas was consistent with radiologists' expertise in diagnosing subsolid nodules.
• DeepDream showed the image features that the CNN learned from a training dataset in a human-recognizable pattern.