Rubin David T, Gottlieb Klaus, Colombel Jean-Frederic, Schott Jean-Pierre, Erisson Lavi, Prucka Bill, Phillips Sloane Allebes, Kwon John, Ng Jonathan, McGill James
University of Chicago Medicine Inflammatory Bowel Disease Center, Gastroenterology, Chicago, Illinois.
Eli Lilly and Company, Immunology, Indianapolis, Indiana.
Gastro Hep Adv. 2023 Jun 17;2(7):935-942. doi: 10.1016/j.gastha.2023.06.003. eCollection 2023.
Endoscopic assessment is a co-primary end point in inflammatory bowel disease registration trials, yet it is subject to inter- and intraobserver variability. We present an original machine learning approach to Endoscopic Mayo Score (eMS) prediction in ulcerative colitis and report the model's performance in differentiating key levels of endoscopic disease activity on full-length procedure videos.
A total of 793 full-length videos with centrally read eMS were obtained from 249 patients with ulcerative colitis who participated in a phase II trial evaluating mirikizumab (NCT02589665). A video annotation approach was established to extract mucosal features and associated eMS classification labels from each video for use as model training inputs. The model's primary objective was a categorical prediction of inactive vs active endoscopic disease, evaluated against 2 independent test sets: a full set with a baseline single human expert read, and a consensus subset in which 2 human reads agreed.
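The two-test-set design described above can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: that "active" disease means an eMS of 2 or 3, and that two reads "agree" when they assign the same eMS. The names `VideoRead`, `is_active`, and `consensus_subset` are hypothetical, and the video records in the usage example are made up for illustration.

```python
from dataclasses import dataclass

# Assumption: eMS >= 2 is treated as active disease; the abstract does not give the cutoff.
ACTIVE_THRESHOLD = 2


@dataclass(frozen=True)
class VideoRead:
    """Hypothetical record holding two human eMS reads (0-3) for one video."""
    video_id: str
    read_a: int
    read_b: int


def is_active(ems: int, threshold: int = ACTIVE_THRESHOLD) -> bool:
    """Map an eMS (0-3) to the binary inactive/active label."""
    if ems not in (0, 1, 2, 3):
        raise ValueError(f"eMS must be 0-3, got {ems}")
    return ems >= threshold


def consensus_subset(reads: list[VideoRead]) -> list[VideoRead]:
    """Keep only videos where both reads assign the same eMS (assumed meaning of 'agreed')."""
    return [r for r in reads if r.read_a == r.read_b]


# Illustrative records only, not trial data.
example_reads = [
    VideoRead("v1", 0, 0),
    VideoRead("v2", 1, 2),
    VideoRead("v3", 3, 3),
]
consensus_ids = [r.video_id for r in consensus_subset(example_reads)]
```

Here `v2` is dropped from the consensus subset because its two reads differ, while all three videos would remain in the full test set.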
On the full test set of 147 videos, the model predicted inactive vs active endoscopic disease via the eMS with an area under the curve of 89%, accuracy of 84%, positive predictive value of 80%, and negative predictive value of 85%. In the consensus test set of 94 videos, the model predicted inactive vs active endoscopic disease with an area under the curve of 92%, accuracy of 89%, positive predictive value of 87%, and negative predictive value of 90%.
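For readers less familiar with the reported metrics, the following sketch shows how accuracy, positive predictive value, negative predictive value, and area under the curve are conventionally computed for a binary inactive/active prediction. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions for illustration.

```python
def binary_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Accuracy, PPV, and NPV from binary labels (1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "ppv": ratio(tp, tp + fp),  # of predicted-active videos, fraction truly active
        "npv": ratio(tn, tn + fn),  # of predicted-inactive videos, fraction truly inactive
    }


def roc_auc(y_true: list[int], scores: list[float]) -> float:
    """AUC as the probability that a random active video outscores a random inactive one."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and model scores, not trial data.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold
toy_metrics = binary_metrics(toy_labels, toy_preds)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Note that PPV and NPV depend on the proportion of active videos in the test set, whereas AUC is computed from the score ranking and does not depend on a chosen threshold.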
We have demonstrated that this machine learning model supervised by mucosal features and eMS video annotations accurately differentiates key levels of endoscopic disease activity.