Stanford University School of Medicine, Stanford, California, USA.
Eur J Gastroenterol Hepatol. 2021 May 1;33(5):645-649. doi: 10.1097/MEG.0000000000001952.
Previous reports of deep learning-assisted assessment of Mayo endoscopic subscore (MES) in ulcerative colitis have only explored the ability to distinguish disease remission (MES 0/1) from severe disease (MES 2/3) or inactive disease (MES 0) from active disease (MES 1-3). We sought to explore the utility of deep learning models in the automated grading of each individual MES in ulcerative colitis.
In this retrospective study, two physicians used the MES to grade 777 representative endoscopic still images from 777 patients with clinically active ulcerative colitis. Each image was assigned an MES of 1, 2, or 3. A 101-layer convolutional neural network model was trained and validated on 90% of the data, and the remaining 10% was reserved as a holdout test set. Model discrimination was assessed by the area under the receiver operating characteristic curve (AUC) and by standard measures of accuracy, specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV).
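The per-grade measures named above can be illustrated with a minimal sketch. It assumes a one-vs-rest treatment of each MES grade and an unweighted (macro) average across grades; the abstract does not state how the averages were formed, so both are assumptions. The function names and the confusion-matrix counts are hypothetical and are not the study's data.

```python
from typing import Dict, List


def one_vs_rest_metrics(
    confusion: List[List[int]], labels: List[int]
) -> Dict[int, Dict[str, float]]:
    """Per-class metrics from a square confusion matrix.

    confusion[i][j] is the number of images whose true grade is labels[i]
    and whose predicted grade is labels[j]. Assumes every grade occurs and
    is predicted at least once, so no denominator is zero.
    """
    total = sum(sum(row) for row in confusion)
    results: Dict[int, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                         # grade k predicted as k
        fn = sum(confusion[k]) - tp                  # grade k predicted as other
        fp = sum(row[k] for row in confusion) - tp   # other predicted as k
        tn = total - tp - fn - fp                    # other predicted as other
        results[label] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "ppv": tp / (tp + fp),
            "npv": tn / (tn + fn),
        }
    return results


def macro_average(per_class: Dict[int, Dict[str, float]], metric: str) -> float:
    """Unweighted mean of one metric across all grades."""
    return sum(m[metric] for m in per_class.values()) / len(per_class)


def overall_accuracy(confusion: List[List[int]]) -> float:
    """Fraction of images whose predicted grade equals the true grade."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)


# Hypothetical counts for MES 1, 2, 3 (rows: true grade, columns: predicted).
toy_confusion = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
toy_metrics = one_vs_rest_metrics(toy_confusion, labels=[1, 2, 3])
toy_accuracy = overall_accuracy(toy_confusion)
toy_mean_sensitivity = macro_average(toy_metrics, "sensitivity")
```

With three grades, each grade's specificity and NPV are computed against the other two grades pooled together, which is why the averaged specificity can exceed the overall accuracy.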
In the holdout test set, the final model classified MES 3 disease with an AUC of 0.96, MES 2 disease with an AUC of 0.86, and MES 1 disease with an AUC of 0.89. Overall accuracy was 77.2%. Across MES 1, 2, and 3, average specificity was 85.7%, average sensitivity was 72.4%, average PPV was 77.7%, and average NPV was 87.0%.
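A per-grade AUC such as those reported above can be computed by treating one grade as positive and the other grades as negative. The sketch below uses the rank (Mann-Whitney) formulation of the AUC and assumes this one-vs-rest setup, which the abstract does not describe explicitly. The function name, scores, and labels are hypothetical and are not the study's data.

```python
from typing import List


def one_vs_rest_auc(
    scores: List[float], true_labels: List[int], positive_label: int
) -> float:
    """AUC for one grade against all others.

    scores[i] is the model's predicted probability that image i has grade
    positive_label. The AUC equals the fraction of (positive, negative)
    image pairs in which the positive image receives the higher score,
    counting ties as half.
    """
    pos = [s for s, y in zip(scores, true_labels) if y == positive_label]
    neg = [s for s, y in zip(scores, true_labels) if y != positive_label]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative image")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities of MES 3 for six images.
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1]
toy_labels = [3, 3, 3, 1, 2, 2]
toy_auc_mes3 = one_vs_rest_auc(toy_scores, toy_labels, positive_label=3)
```

The pairwise loop is quadratic in the number of images, which is acceptable for a test set of this size; a sort-based rank computation would be the usual choice for larger sets.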
We have demonstrated that a deep learning model can robustly classify individual grades of endoscopic disease severity among patients with ulcerative colitis.