Chen Binjun, Li Yike, Sun Yu, Sun Haojie, Wang Yanmei, Lyu Jihan, Guo Jiajie, Bao Shunxing, Cheng Yushu, Niu Xun, Yang Lian, Xu Jianghong, Yang Juanmei, Huang Yibo, Chi Fanglu, Liang Bo, Ren Dongdong
ENT Institute and Department of Otorhinolaryngology, Eye & ENT Hospital, Fudan University, Shanghai, China.
NHC Key Laboratory of Hearing Medicine Research, Eye & ENT Hospital, Fudan University, Shanghai, China.
J Med Internet Res. 2024 Aug 8;26:e51706. doi: 10.2196/51706.
Temporal bone computed tomography (CT) helps diagnose chronic otitis media (COM). However, its interpretation requires training and expertise. Artificial intelligence (AI) can help clinicians evaluate COM through CT scans, but existing models lack transparency and may not fully leverage multidimensional diagnostic information.
We aimed to develop an explainable AI system based on 3D convolutional neural networks (CNNs) for automatic CT-based evaluation of COM.
Temporal bone CT scans were retrospectively obtained from patients who underwent surgery for COM between December 2015 and July 2021 at 2 independent institutions. A region of interest encompassing the middle ear was automatically segmented, and 3D CNNs were then trained to identify pathological ears and cholesteatoma. An ablation study was performed to refine the model architecture. The model was benchmarked against a baseline 2D model and 7 clinical experts. Model performance was measured through cross-validation and external validation. Heat maps generated with Gradient-Weighted Class Activation Mapping were used to highlight the regions that drove the model's decisions. Finally, the AI system was assessed in a prospective cohort as an aid to clinicians in preoperative COM assessment.
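The heat-map step can be illustrated with a minimal sketch of the Gradient-Weighted Class Activation Mapping combination rule for volumetric data. This is not the authors' implementation: the function name `grad_cam_3d`, the array shapes, and the toy inputs are assumptions made for illustration, and the feature maps and gradients are supplied directly rather than extracted from a trained 3D CNN.

```python
import numpy as np


def grad_cam_3d(activations, gradients):
    """Combine feature maps and gradients into a Grad-CAM volume.

    Both inputs are assumed to have shape (channels, depth, height, width):
    `activations` are the feature maps of one convolutional layer and
    `gradients` are the derivatives of the class score with respect to them.
    """
    # Channel weights: average each channel's gradient over all voxels.
    weights = gradients.mean(axis=(1, 2, 3))
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=([0], [0]))
    # ReLU keeps only voxels that support the class.
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy input (hypothetical values): channel 0 supports the class,
# channel 1 argues against it.
acts = np.zeros((2, 1, 1, 2))
acts[0, 0, 0, 0] = 1.0
acts[1, 0, 0, 1] = 1.0
grads = np.empty((2, 1, 1, 2))
grads[0] = 2.0
grads[1] = -1.0

heatmap = grad_cam_3d(acts, grads)
```

In practice the resulting volume would be upsampled to the CT resolution and overlaid on the scan. Those steps depend on the imaging toolkit and are omitted here.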
Internal and external data sets contained 1661 and 108 patients (3153 and 211 eligible ears), respectively. For detecting pathological ears, the 3D model performed well on the 2 data sets, with mean areas under the receiver operating characteristic curve of 0.96 (SD 0.01) and 0.93 (SD 0.01) and mean accuracies of 0.878 (SD 0.017) and 0.843 (SD 0.015), respectively. Similar outcomes were observed for cholesteatoma identification, with mean areas under the receiver operating characteristic curve of 0.85 (SD 0.03) and 0.83 (SD 0.05) and mean accuracies of 0.783 (SD 0.04) and 0.813 (SD 0.033), respectively. The proposed 3D model achieved a favorable balance between performance and network size relative to alternative models. It significantly outperformed the 2D approach in detecting COM (P≤.05) and showed a substantial gain in identifying cholesteatoma (P<.001). The model also demonstrated superior diagnostic capabilities over resident fellows and the attending otologist (P<.05), rivaling all senior clinicians in both tasks. The generated heat maps appropriately highlighted the middle ear and mastoid regions, consistent with how clinicians interpret temporal bone CT. The resulting AI system achieved an accuracy of 81.8% in generating preoperative diagnoses for 121 patients and contributed to clinical decision-making in 90.1% of cases.
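As a reminder of how fold-level figures such as these are typically summarized, the following sketch computes the area under the receiver operating characteristic curve and the accuracy for each fold, then reports their mean and SD. It is not the authors' evaluation code: the helper names, the 0.5 decision threshold, the use of the sample SD, and the toy labels and scores are all assumptions made for illustration.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at an assumed threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical toy folds: (true labels, predicted probabilities).
folds = [
    ([1, 1, 0, 0], [0.9, 0.6, 0.4, 0.2]),
    ([1, 0, 1, 0], [0.8, 0.7, 0.3, 0.1]),
]

fold_aucs = [roc_auc(y, s) for y, s in folds]
fold_accs = [accuracy(y, s) for y, s in folds]

# Summaries in the "mean (SD)" form used above; stdev is the sample SD.
auc_summary = (mean(fold_aucs), stdev(fold_aucs))
acc_summary = (mean(fold_accs), stdev(fold_accs))
```

The pairwise definition of the area under the curve is quadratic in the number of cases, so a rank-based routine is preferable for large data sets.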
We present a 3D CNN model trained to detect pathological changes and identify cholesteatoma on temporal bone CT scans. In both tasks, the model significantly outperforms the baseline 2D approach and performs at a level comparable with or surpassing that of human experts. The model also shows good generalizability and improved interpretability. This AI system facilitates automatic COM assessment and shows promising viability in real-world clinical settings. These findings underscore AI's potential as a valuable aid for clinicians in COM evaluation.
Chinese Clinical Trial Registry ChiCTR2000036300; https://www.chictr.org.cn/showprojEN.html?proj=58685.