OuYang Pu-Yun, He Yun, Guo Jian-Gui, Liu Jia-Ni, Wang Zhi-Long, Li Anwei, Li Jiajian, Yang Shan-Shan, Zhang Xu, Fan Wei, Wu Yi-Shan, Liu Zhi-Qiao, Zhang Bao-Yu, Zhao Ya-Nan, Gao Ming-Yong, Zhang Wei-Jun, Xie Chuan-Miao, Xie Fang-Yun
Department of Radiation Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou, China.
Department of Radiology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangdong Key Laboratory of Nasopharyngeal Carcinoma Diagnosis and Therapy, Guangzhou, China.
EClinicalMedicine. 2023 Aug 30;63:102202. doi: 10.1016/j.eclinm.2023.102202. eCollection 2023 Sep.
MRI is the routine examination for surveillance of nasopharyngeal carcinoma recurrence, but it is less sensitive than PET/CT. We aimed to determine whether artificial intelligence (AI) could serve as a competent pre-inspector for MRI radiologists, and whether AI-aided MRI could perform better or even match PET/CT.
This multicenter study enrolled 6916 patients from five hospitals between September 2009 and October 2020. A 2.5D convolutional neural network diagnostic model and an nnU-Net contouring model were developed in the training and test cohorts and then used to independently predict and visualize recurrence in patients in the internal and external validation cohorts. We evaluated the area under the ROC curve (AUC) of the AI model and compared its sensitivity and specificity with those of MRI and PET/CT using the McNemar test. The prospective cohort was randomized into AI and non-AI groups, whose sensitivity and specificity were compared using the chi-square test.
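The paired comparison described above can be illustrated with a minimal sketch of an exact (binomial) McNemar test on sensitivity. This is not the authors' code: the function names, the 0/1 encoding, and the choice of the exact variant are assumptions made for illustration, and the abstract does not say which variant of the test was used.

```python
from math import comb


def discordant_counts(reader_a, reader_b, truth):
    """Count discordant pairs among truly recurrent patients.

    All inputs are equal-length sequences of 0/1 (hypothetical encoding):
    reader_a / reader_b = 1 if that reader called recurrence, truth = 1 if
    recurrence was confirmed. Returns (b, c), where b = A positive and B
    negative, c = A negative and B positive.
    """
    if not (len(reader_a) == len(reader_b) == len(truth)):
        raise ValueError("inputs must have the same length")
    b = c = 0
    for a_call, b_call, t in zip(reader_a, reader_b, truth):
        if not t:
            continue  # sensitivity only concerns truly recurrent patients
        if a_call and not b_call:
            b += 1
        elif b_call and not a_call:
            c += 1
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Toy data, not taken from the study.
    ai = [1, 1, 0, 0, 1]
    mri = [1, 0, 1, 0, 0]
    recurrence = [1, 1, 1, 1, 0]
    b, c = discordant_counts(ai, mri, recurrence)
    print(b, c, mcnemar_exact(b, c))
```

Only the discordant pairs carry information in this test, which is why two readers with nearly identical sensitivities (such as 74.3% vs. 74.7%) can yield a large P value. Specificity would be compared the same way, restricted to patients without recurrence.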
The AI model achieved AUCs of 0.92 and 0.88 in the internal and external validation cohorts, corresponding to sensitivities of 79.5% and 74.3% and specificities of 91.0% and 92.8%, respectively. It had comparable sensitivity to MRI (e.g., 74.3% vs. 74.7%, P = 0.89) but lower sensitivity than PET/CT (77.9% vs. 92.0%, P < 0.0001) at the same individual specificities. The AI model achieved moderate precision, with a median Dice similarity coefficient of 0.67. AI-aided MRI improved specificity (92.5% vs. 85.0%, P = 0.034) and equaled PET/CT in the internal validation subcohort, and it increased sensitivity (81.9% vs. 70.8%, P = 0.021) in the external validation subcohort. In the prospective cohort of 1248 patients, the AI group had higher sensitivity than the non-AI group (78.6% vs. 67.3%, P = 0.23), although the difference was not statistically significant. In future randomized controlled trials, a sample size of 3943 patients in each arm would be required to demonstrate a statistically significant difference.
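For readers unfamiliar with the Dice similarity coefficient reported above, the following is a minimal sketch of how it is computed for two binary masks, 2|A ∩ B| / (|A| + |B|). The function name, the flattened 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the study.

```python
def dice_coefficient(pred_mask, ref_mask):
    """Dice similarity coefficient between two flattened binary masks.

    pred_mask: model contour (hypothetical input), ref_mask: reference
    contour, both equal-length sequences of 0/1 voxel labels.
    """
    pred = [bool(v) for v in pred_mask]
    ref = [bool(v) for v in ref_mask]
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(pred) + sum(ref)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * overlap / total


if __name__ == "__main__":
    # Toy masks, not taken from the study: 2 shared voxels, 3 voxels each.
    predicted = [1, 1, 1, 0, 0, 0]
    reference = [0, 1, 1, 1, 0, 0]
    print(round(dice_coefficient(predicted, reference), 2))  # 0.67
```

A value of 1 means the predicted and reference contours coincide exactly, and 0 means they do not overlap at all.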
The AI model performed on par with MRI read by expert radiologists, and AI-aided MRI read by expert radiologists performed on par with PET/CT. A larger randomized controlled trial is warranted to adequately demonstrate the benefit of AI.
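To give a sense of why a larger trial is needed, the sketch below applies the standard normal-approximation formula for comparing two independent proportions to the observed sensitivities (78.6% vs. 67.3%). It is an illustration under stated assumptions, not a reconstruction of the authors' calculation: the two-sided alpha of 0.05, the power of 0.80, the unpooled variance, and the `recurrence_rate` parameter are all hypothetical, and the abstract does not report how the figure of 3943 patients per arm was derived.

```python
from math import ceil
from statistics import NormalDist


def events_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Recurrent patients needed per arm to detect sensitivity p1 vs. p2.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1-p2)^2.
    alpha and power are assumed defaults, not values from the abstract.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


def patients_per_arm(events, recurrence_rate):
    """Scale up to enrolled patients, since sensitivity is estimated only
    in patients who actually recur. recurrence_rate is a hypothetical input."""
    if not 0 < recurrence_rate <= 1:
        raise ValueError("recurrence_rate must be in (0, 1]")
    return ceil(events / recurrence_rate)


if __name__ == "__main__":
    needed = events_per_arm(0.786, 0.673)
    print("recurrent patients per arm:", needed)
    # 0.10 is an arbitrary example rate, not a figure from the study.
    print("enrolled patients per arm:", patients_per_arm(needed, 0.10))
```

Because recurrence is uncommon during surveillance, the number of enrolled patients must be many times the number of recurrences required, which is one plausible reason the per-arm requirement runs into the thousands.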
The Sun Yat-sen University Clinical Research 5010 Program (2015020), Guangdong Basic and Applied Basic Research Foundation (2022A1515110356), and Guangzhou Science and Technology Program (2023A04J1788).