Shen Ming-Hung, Huang Chi-Cheng, Chen Yu-Tsung, Tsai Yi-Jian, Liou Fou-Ming, Chang Shih-Chang, Phan Nam Nhut
Department of Surgery, Fu Jen Catholic University Hospital, Fu Jen Catholic University, New Taipei City 24205, Taiwan.
School of Medicine, College of Medicine, Fu Jen Catholic University, New Taipei City 24205, Taiwan.
Diagnostics (Basel). 2023 Apr 19;13(8):1473. doi: 10.3390/diagnostics13081473.
The present study aimed to develop an AI-based system for the detection and classification of polyps in colonoscopy images. A total of approximately 256,220 colonoscopy images from 5000 colorectal cancer patients were collected and processed. We used a CNN model for polyp detection and an EfficientNet-b0 model for polyp classification. The data were partitioned into training, validation, and testing sets in a 70%/15%/15% ratio. After the model was trained, validated, and tested, we rigorously evaluated its performance through further external validation, collecting data from 3 hospitals both prospectively (n = 150) and retrospectively (n = 385). On the testing set, the deep learning model reached a state-of-the-art sensitivity of 0.9709 (95% CI: 0.9646-0.9757) and specificity of 0.9701 (95% CI: 0.9663-0.9749) for polyp detection. The polyp classification model attained an AUC of 0.9989 (95% CI: 0.9954-1.00). In the external validation across the 3 hospitals, the model achieved a lesion-based sensitivity of 0.9516 (95% CI: 0.9295-0.9670) and a frame-based specificity of 0.9720 (95% CI: 0.9713-0.9726) for polyp detection, and an AUC of 0.9521 (95% CI: 0.9308-0.9734) for polyp classification. This high-performance, deep-learning-based system could be used in clinical practice to facilitate rapid, efficient, and reliable decisions by physicians and endoscopists.
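As an illustration only (not the authors' actual pipeline, which is not published in the abstract), a 70%/15%/15% train/validation/test partition like the one described above can be sketched in plain Python. The image filenames and random seed below are hypothetical placeholders.

```python
import random

def split_dataset(items, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle and partition a list of items into train/validation/test
    subsets; the remainder after the train and validation fractions
    (here 15%) becomes the test set."""
    rng = random.Random(seed)
    shuffled = items[:]  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Hypothetical identifiers standing in for the colonoscopy frames.
images = [f"frame_{i:06d}.png" for i in range(1000)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))  # 700 150 150
```

In practice a per-patient split (all frames from one patient in the same subset) is preferable for endoscopy data, to avoid near-duplicate frames leaking between training and testing; whether the authors split by frame or by patient is not stated in the abstract.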