Prediction of future gastric cancer risk using a machine learning algorithm and comprehensive medical check-up data: A case-control study.

Affiliations

Faculty of Informatics and Engineering, The University of Electro-Communications, Tokyo, Japan.

Department of General Medicine, School of Medicine, Juntendo University, Tokyo, Japan.

Publication information

Sci Rep. 2019 Aug 27;9(1):12384. doi: 10.1038/s41598-019-48769-y.

DOI:10.1038/s41598-019-48769-y

PMID:31455831

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6712020/

Abstract

A comprehensive screening method using machine learning and many factors (biological characteristics, Helicobacter pylori infection status, endoscopic findings and blood test results), accumulated daily as data in hospitals, could improve the accuracy of screening to classify patients at high or low risk of developing gastric cancer. We used XGBoost, a classification method known for achieving numerous winning solutions in data analysis competitions, to capture nonlinear relations among many input variables and outcomes using the boosting approach to machine learning. Longitudinal and comprehensive medical check-up data were collected from 25,942 participants who underwent multiple endoscopies from 2006 to 2017 at a single facility in Japan. The participants were classified into a case group (y = 1) or a control group (y = 0) if gastric cancer was or was not detected, respectively, during a 122-month period. Among 1,431 total participants (89 cases and 1,342 controls), 1,144 (80%) were randomly selected for use in training 10 classification models; the remaining 287 (20%) were used to evaluate the models. The results showed that XGBoost outperformed logistic regression and showed the highest area under the curve value (0.899). Accumulating more data in the facility and performing further analyses including other input variables may help expand the clinical utility.
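The evaluation protocol described in the abstract (a random 80/20 split of the 1,431 participants, with models compared by area under the ROC curve) can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function names `split_train_test` and `roc_auc` are hypothetical, the random seed is arbitrary, and no classifier is trained here. The abstract does not state whether the split was stratified, so the sketch assumes a simple random shuffle.

```python
import random


def split_train_test(ids, train_fraction=0.8, seed=0):
    """Randomly split ids into a training set and a held-out evaluation set."""
    rng = random.Random(seed)  # arbitrary seed, used only for reproducibility
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # floor: 1,431 -> 1,144
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    The value equals the probability that a randomly chosen case (y = 1)
    is scored higher than a randomly chosen control (y = 0), with ties
    counted as one half.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case and one control")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_statistic / (n_pos * n_neg)


if __name__ == "__main__":
    # Group sizes from the abstract: 89 cases (y = 1), 1,342 controls (y = 0).
    labels = [1] * 89 + [0] * 1342
    train_ids, test_ids = split_train_test(range(len(labels)))
    print(len(train_ids), len(test_ids))
```

In practice the held-out labels and each model's predicted risk scores would be passed to `roc_auc`, and the resulting values compared across models, which is how a figure such as the reported 0.899 for XGBoost would be obtained.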

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fe/6712020/34144d616724/41598_2019_48769_Fig1_HTML.jpg
