Department of Medicine, University of Hong Kong, Hong Kong, China.
Department of Surgery, University of Hong Kong, Hong Kong, China.
Aliment Pharmacol Ther. 2021 Apr;53(8):864-872. doi: 10.1111/apt.16272. Epub 2021 Jan 24.
The risk of gastric cancer after Helicobacter pylori (H. pylori) eradication remains unknown.
This study aimed to evaluate the performance of seven different machine learning models in predicting gastric cancer risk after H. pylori eradication.
We identified H. pylori-infected patients who had received clarithromycin-based triple therapy between 2003 and 2014 in Hong Kong. Patients were divided into training (n = 64 238) and validation (n = 25 330) sets according to the period of eradication therapy. The data were used to construct seven machine learning models to predict the risk of gastric cancer development within 5 years after H. pylori eradication. A total of 26 clinical variables were input into these models. Performance was measured by the area under the receiver operating characteristic curve (AUC).
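As a rough illustration of the evaluation design described above, the sketch below splits records by treatment period and computes an AUC from predicted risk scores. It is not the authors' code. The record layout (a `year` key), the cutoff year, and the function names `split_by_period` and `roc_auc` are assumptions made for this example, and the abstract does not state where the period boundary was drawn.

```python
from typing import Dict, List, Sequence, Tuple


def split_by_period(
    records: Sequence[Dict], cutoff_year: int
) -> Tuple[List[Dict], List[Dict]]:
    """Temporal split: records treated before cutoff_year go to training,
    the rest to validation. The cutoff is hypothetical."""
    train = [r for r in records if r["year"] < cutoff_year]
    valid = [r for r in records if r["year"] >= cutoff_year]
    return train, valid


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the rank-sum (Mann-Whitney) identity: the probability that a
    randomly chosen positive scores higher than a randomly chosen negative,
    with ties counted as one half."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Average of the 1-based ranks i+1 .. j for this tied run.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

In practice a library routine such as scikit-learn's `roc_auc_score` would normally be used. The hand-written version is shown only to make the definition of AUC concrete.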
During a mean follow-up of 4.7 years, 0.21% of H. pylori-eradicated patients developed gastric cancer. Of the seven machine learning models, extreme gradient boosting (XGBoost) had the best performance in predicting cancer development (AUC 0.97, 95% CI 0.96-0.98) and was superior to conventional logistic regression (AUC 0.90, 95% CI 0.84-0.92). With the XGBoost model, 6.6% of patients were considered at high risk of gastric cancer, with a miss rate of 1.9%. Patient age, presence of intestinal metaplasia, and gastric ulcer were the heavily weighted factors in the XGBoost model.
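The two triage figures reported above, the proportion flagged as high risk and the miss rate, can be computed from predicted scores roughly as follows. This is a sketch under stated assumptions: the abstract does not define "miss rate", so it is taken here to mean the fraction of patients who developed cancer but were not flagged. The threshold and the function name `triage_summary` are likewise hypothetical.

```python
from typing import Sequence, Tuple


def triage_summary(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (high_risk_fraction, miss_rate) for a given score threshold.

    high_risk_fraction: share of all patients with score >= threshold.
    miss_rate (assumed definition): share of cancer cases (label 1)
    whose score fell below the threshold.
    """
    if len(labels) != len(scores) or not labels:
        raise ValueError("labels and scores must be non-empty and equal length")

    flagged = [s >= threshold for s in scores]
    cancers = sum(labels)
    if cancers == 0:
        raise ValueError("miss rate is undefined without any cancer cases")

    missed = sum(1 for y, f in zip(labels, flagged) if y == 1 and not f)
    return sum(flagged) / len(scores), missed / cancers


# Toy data, not from the study: 10 patients, 2 cancers, threshold 0.5.
toy_labels = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.1, 0.0, 0.2, 0.4, 0.1]
high_risk, miss = triage_summary(toy_labels, toy_scores, 0.5)
```

Lowering the threshold flags more patients for surveillance and reduces the miss rate, so the two quantities trade off against each other.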
Based on simple baseline patient information, a machine learning model can accurately predict the risk of post-eradication gastric cancer. This model could substantially reduce the number of patients who require endoscopic surveillance.