Zhang Heng, Li Wenru, Chen Tao, Deng Ke, Yang Bolin, Luo Jingen, Yao Jiaying, Lin Yuhuan, Li Juan, Meng Xiaochun, Lin Hongcheng, Ren Donglin, Li Lanlan
Department of General Surgery (Colorectal Surgery), The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, Guangdong 510655, PR China.
Guangdong Provincial Key Laboratory of Colorectal and Pelvic Floor Diseases, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, Guangdong 510655, PR China.
EClinicalMedicine. 2024 Nov 22;78:102940. doi: 10.1016/j.eclinm.2024.102940. eCollection 2024 Dec.
A single reliable modality for the early differentiation of perianal fistulizing Crohn's disease (PFCD) from cryptoglandular fistula (CGF) is currently lacking. We aimed to develop and validate an MRI-based deep learning classifier to discriminate between the two.
This retrospective study enrolled 1054 patients with PFCD or CGF from three Chinese tertiary referral hospitals between January 1, 2015, and December 31, 2021. The patients were divided into four cohorts: a training cohort (n = 800), a validation cohort (n = 100), an internal test cohort (n = 100), and an external test cohort (n = 54). Two deep convolutional neural networks (DCNNs), MobileNetV2 and ResNet50, were each trained with a transfer learning strategy on a dataset of 44871 MR images. The performance of the DCNN models was compared with that of radiologists using receiver operating characteristic (ROC) curve analysis, accuracy, sensitivity, and specificity. The DeLong test was used to compare areas under the curve (AUCs). Univariate and multivariate analyses were conducted to explore factors potentially associated with classifier performance.
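The comparison metrics named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the labels (1 = PFCD, 0 = CGF), the scores, the 0.5 decision threshold, and the function names `roc_auc` and `confusion_metrics` are all hypothetical, chosen only to show how AUC, accuracy, sensitivity, and specificity are computed from per-patient classifier outputs.

```python
def roc_auc(labels, scores):
    """AUC as the Mann-Whitney probability that a positive case
    scores higher than a negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed threshold.
    The threshold of 0.5 is an assumption made for illustration."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Made-up toy data: 1 = PFCD, 0 = CGF; scores are predicted PFCD probabilities.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_metrics = confusion_metrics(toy_labels, toy_scores)
print(toy_auc, toy_metrics)
```

AUC is threshold-free, whereas accuracy, sensitivity, and specificity depend on the chosen cut-off, which is one reason studies of this kind typically report both.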
A total of 532 patients with PFCD and 522 patients with CGF were included. Both pre-trained DCNN classifiers performed well in the internal test cohort (MobileNetV2 AUC: 0.962, 95% CI 0.903-0.990; ResNet50 AUC: 0.963, 95% CI 0.905-0.990) and in the external test cohort (MobileNetV2 AUC: 0.885, 95% CI 0.769-0.956; ResNet50 AUC: 0.874, 95% CI 0.756-0.949). Both had greater AUCs than the radiologists (all p ≤ 0.001), and their AUCs were comparable to each other (p = 0.83 and p = 0.60 in the two test cohorts). None of the candidate characteristics significantly affected the performance of the pre-trained MobileNetV2 classifier in etiologic diagnosis. Previous fistula surgery influenced the performance of the pre-trained ResNet50 classifier in the internal test cohort (OR 0.157, 95% CI 0.025-0.997, p = 0.05).
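The borderline result for previous fistula surgery can be sanity-checked from the reported odds ratio and confidence interval alone. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not state. The function name `wald_p_from_or_ci` is hypothetical. Under that assumption, the standard error is recovered from the interval width and the two-sided p-value comes out just under 0.05, consistent with the reported p = 0.05.

```python
import math


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the Wald z statistic and two-sided p-value from an odds
    ratio and its 95% CI, assuming the CI is exp(log(OR) +/- z_crit * SE)."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values reported for ResNet50 in the internal test cohort.
z_stat, p_value = wald_p_from_or_ci(0.157, 0.025, 0.997)
print(f"z = {z_stat:.3f}, p = {p_value:.3f}")
```

An upper confidence limit of 0.997 sits very close to 1, so this association is at the edge of conventional significance and is best read with that in mind.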
The DCNN classifiers were more robust than manual visual assessment by radiologists in distinguishing PFCD from CGF, showing their potential to assist in the early detection of PFCD. Our findings highlight the promising generalization performance of MobileNetV2 relative to ResNet50, making it suitable for deployment on mobile devices.
National Natural Science Foundation of China.