Veterans Affairs Center for Clinical Management Research, Lieutenant Colonel Charles S. Kettles Veterans Affairs Medical Center, Ann Arbor, Michigan; Division of Gastroenterology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan; Rogel Cancer Center, University of Michigan Medical School, Ann Arbor, Michigan; Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, Michigan.
Department of Statistics, University of Michigan College of Literature, Science, and Arts, Ann Arbor, Michigan.
Gastroenterology. 2023 Dec;165(6):1420-1429.e10. doi: 10.1053/j.gastro.2023.08.011. Epub 2023 Aug 18.
BACKGROUND & AIMS: Tools that can automatically predict incident esophageal adenocarcinoma (EAC) and gastric cardia adenocarcinoma (GCA) using electronic health records to guide screening decisions are needed.
METHODS: The Veterans Health Administration (VHA) Corporate Data Warehouse was accessed to identify Veterans with 1 or more encounters between 2005 and 2018. Patients diagnosed with EAC (n = 8430) or GCA (n = 2965) were identified in the VHA Central Cancer Registry and compared with 10,256,887 controls. Predictors included demographic characteristics, prescriptions, laboratory results, and diagnoses between 1 and 5 years before the index date. The Kettles Esophageal and Cardia Adenocarcinoma predictioN (K-ECAN) tool was developed and internally validated using simple random sampling imputation and extreme gradient boosting, a machine learning method. Training was performed in 50% of the data, preliminary validation in 25% of the data, and final testing in 25% of the data.
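The data handling described above can be illustrated with a minimal sketch. The abstract does not specify implementation details, so everything here is an assumption: "simple random sampling imputation" is taken to mean replacing each missing value with a random draw from the observed values of the same variable, the 50%/25%/25% partition is taken to be a simple random shuffle, and the function names and toy laboratory values are hypothetical. Fitting the gradient boosting model itself is left out.

```python
import random


def split_50_25_25(rows, seed=0):
    """Shuffle rows, then split into train (50%), validation (25%), test (rest)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = len(shuffled) // 2
    n_valid = len(shuffled) // 4
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


def random_sample_impute(values, seed=0):
    """Replace each None with a random draw from the observed values.

    Assumed interpretation of "simple random sampling imputation";
    the abstract does not describe the exact procedure.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    return [v if v is not None else rng.choice(observed) for v in values]


# Hypothetical toy laboratory column with missing entries.
hemoglobin = [13.2, None, 14.1, 12.8, None, 15.0, 13.9, None]
imputed = random_sample_impute(hemoglobin, seed=42)

# Hypothetical patient identifiers 0..99.
train, valid, test = split_50_25_25(range(100), seed=42)
```

Drawing from observed values preserves the marginal distribution of each predictor, which is one common reason to prefer it over mean imputation when a tree-based model is fit afterward.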
RESULTS: K-ECAN was well-calibrated and had better discrimination (area under the receiver operating characteristic curve [AuROC], 0.77) than previously validated models, such as the Nord-Trøndelag Health Study (AuROC, 0.68) and Kunzmann model (AuROC, 0.64), or published guidelines. Using only data from between 3 and 5 years before the index date diminished its accuracy slightly (AuROC, 0.75). When men were undersampled to simulate a non-VHA population, the AuROCs of the Nord-Trøndelag Health Study and Kunzmann model improved, but K-ECAN remained the most accurate (AuROC, 0.85). Although gastroesophageal reflux disease was strongly associated with EAC, it contributed only a small proportion of the information gain for prediction.
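For readers unfamiliar with the metric, the sketch below shows one standard way to compute an AuROC, as the probability that a randomly chosen case receives a higher score than a randomly chosen control, together with a simple way to undersample men. It is an illustration only, not the authors' code: the function names, the `sex` field, and the `keep_fraction` parameter are hypothetical, and the abstract does not state what sampling fraction was used.

```python
import random


def auroc(scores, labels):
    """AuROC as P(case score > control score), counting ties as 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def undersample_men(records, keep_fraction, seed=0):
    """Keep every non-male record and a random fraction of male records."""
    rng = random.Random(seed)
    return [r for r in records
            if r["sex"] != "M" or rng.random() < keep_fraction]


# Toy example: two cases (label 1) and two controls (label 0).
toy_auc = auroc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])  # 3 of 4 pairs ranked correctly

# Hypothetical cohort records.
cohort = [{"sex": "M"}, {"sex": "F"}, {"sex": "M"}, {"sex": "F"}]
women_only = undersample_men(cohort, keep_fraction=0.0)
```

The pairwise loop is quadratic in cohort size and is meant only to make the definition concrete; a rank-based implementation would be needed at the scale of roughly 10 million controls.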
CONCLUSIONS: K-ECAN is a novel, internally validated tool predicting incident EAC and GCA using electronic health records data. Further work is needed to validate K-ECAN outside VHA and to assess how best to implement it within electronic health records.