Department of Diagnostic Imaging, Sheba Medical Center, Tel Hashomer, Israel; Sackler Medical School, Tel Aviv University, Tel Aviv, Israel; DeepVision Lab, Sheba Medical Center, Tel Hashomer, Israel.
DeepVision Lab, Sheba Medical Center, Tel Hashomer, Israel.
Gastrointest Endosc. 2021 Jan;93(1):187-192. doi: 10.1016/j.gie.2020.05.066. Epub 2020 Jun 12.
Capsule endoscopy (CE) is an important modality for the diagnosis and follow-up of Crohn's disease (CD). The severity of ulcers at endoscopy is significant for predicting the course of CD. Deep learning has been shown to detect ulcers on CE accurately. However, deep learning has not yet been applied to the endoscopic classification of ulcers. The aim of our study was to develop a deep learning algorithm for automated grading of CD ulcers on CE.
We retrospectively collected CE images of CD ulcers from our CE database. In experiment 1, the severity of each ulcer was graded by 2 capsule readers based on the PillCam CD classification (grades 1-3 from mild to severe), and the inter-reader variability was evaluated. In experiment 2, a consensus reading by 3 capsule readers was used to train an ordinal convolutional neural network (CNN) to automatically grade images of ulcers, and the resulting algorithm was tested against the consensus reading. A pretraining stage included training the network on images of normal mucosa and ulcerated mucosa.
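The abstract does not say how the ordinal CNN represents the 3 severity grades. One common approach, shown here only as a minimal sketch and not as the authors' method, encodes each grade as cumulative binary targets ("more severe than grade 1", "more severe than grade 2"), so that confusing grade 1 with grade 3 costs more than confusing adjacent grades. The function names `encode_ordinal` and `decode_ordinal` and the 0.5 threshold are hypothetical.

```python
def encode_ordinal(grade, num_grades=3):
    """Encode a grade in 1..num_grades as cumulative binary targets.

    Target k (k = 1 .. num_grades - 1) is 1 when the grade exceeds k.
    This encoding is an assumption; the source does not describe it.
    """
    if not 1 <= grade <= num_grades:
        raise ValueError("grade must be between 1 and num_grades")
    return [1 if grade > k else 0 for k in range(1, num_grades)]


def decode_ordinal(probs, threshold=0.5):
    """Turn per-threshold probabilities back into a grade.

    The grade is 1 plus the number of leading outputs above the
    threshold; counting stops at the first output that falls below it.
    """
    grade = 1
    for p in probs:
        if p <= threshold:
            break
        grade += 1
    return grade


if __name__ == "__main__":
    for g in (1, 2, 3):
        print(g, encode_ordinal(g))
    print(decode_ordinal([0.9, 0.2]))
```

With this encoding, a network would have 2 sigmoid outputs instead of 3 softmax classes, and the PillCam CD grades 1 to 3 map to `[0, 0]`, `[1, 0]`, and `[1, 1]`.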
Overall, our dataset included 17,640 CE images from 49 patients: 7391 images with mucosal ulcers and 10,249 normal images. A total of 2598 randomly selected pathologic images were further graded from 1 to 3 according to ulcer severity across the 2 experiments. In experiment 1, the 2 readers agreed on 31% of the images overall (345 of 1108) and on 76% (752 of 989) when distinguishing grade 1 from grade 3. In experiment 2, the algorithm was trained on 1242 images. Its agreement with the consensus reading was 67% overall (166 of 248) and 91% (158 of 173) when distinguishing grade 1 from grade 3. The classification accuracy of the algorithm was 0.91 (95% confidence interval, 0.867-0.954) for grade 1 versus grade 3 ulcers, 0.78 (95% confidence interval, 0.716-0.844) for grade 2 versus grade 3, and 0.624 (95% confidence interval, 0.547-0.701) for grade 1 versus grade 2.
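The agreement percentages above follow directly from the reported counts, and the interval for grade 1 versus grade 3 is close to what a normal-approximation (Wald) interval gives for 173 images. The sketch below reproduces that arithmetic. The choice of the Wald method is an assumption, since the abstract does not state how the intervals were computed, and the helper names are hypothetical. The sample sizes for the other two comparisons are not reported, so they are not recomputed here.

```python
import math


def agreement_percent(matches, total):
    """Percentage of images on which two readings agree."""
    return 100.0 * matches / total


def wald_interval(p, n, z=1.96):
    """Normal-approximation 95% interval for a proportion p over n items.

    Assumption: the abstract does not name the interval method used.
    """
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


if __name__ == "__main__":
    # Counts taken from the reported results.
    print(round(agreement_percent(345, 1108)))  # inter-reader, all grades
    print(round(agreement_percent(752, 989)))   # inter-reader, grade 1 vs 3
    print(round(agreement_percent(166, 248)))   # algorithm, all grades
    print(round(agreement_percent(158, 173)))   # algorithm, grade 1 vs 3
    low, high = wald_interval(0.91, 173)
    print(f"{low:.3f}-{high:.3f}")
```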
The CNN achieved high accuracy in detecting severe CD ulcerations. CNN-assisted CE reading in patients with CD could facilitate and improve diagnosis and monitoring.