
Prediction of future gastric cancer risk using a machine learning algorithm and comprehensive medical check-up data: A case-control study.

Affiliations

Faculty of Informatics and Engineering, The University of Electro-Communications, Tokyo, Japan.

Department of General Medicine, School of Medicine, Juntendo University, Tokyo, Japan.

Publication information

Sci Rep. 2019 Aug 27;9(1):12384. doi: 10.1038/s41598-019-48769-y.

Abstract

A comprehensive screening method that applies machine learning to the many factors accumulated daily as hospital data (biological characteristics, Helicobacter pylori infection status, endoscopic findings and blood test results) could improve the accuracy of screening that classifies patients as being at high or low risk of developing gastric cancer. We used XGBoost, a boosting-based classification method behind numerous winning solutions in data analysis competitions, to capture nonlinear relations among many input variables and outcomes. Longitudinal and comprehensive medical check-up data were collected from 25,942 participants who underwent multiple endoscopies from 2006 to 2017 at a single facility in Japan. Participants were classified into a case group (y = 1) if gastric cancer was detected during a 122-month period, or into a control group (y = 0) if it was not. Among 1,431 total participants (89 cases and 1,342 controls), 1,144 (80%) were randomly selected to train 10 classification models; the remaining 287 (20%) were used to evaluate the models. XGBoost outperformed logistic regression and achieved the highest area under the curve value (0.899). Accumulating more data in the facility and performing further analyses including other input variables may help expand the clinical utility.
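The split and the evaluation metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the helper names `split_train_test` and `roc_auc`, the seed, and the flooring rule used to get 1,144 from 1,431 × 0.8 are all assumptions made for illustration, and no model is trained here. The AUC is computed in its rank-based form, as the probability that a randomly chosen case receives a higher score than a randomly chosen control.

```python
import random


def split_train_test(items, train_fraction, seed):
    """Randomly split items into (train, test) lists.

    Hypothetical helper: the abstract only says the selection was random,
    so the shuffle, the seed, and the flooring below are assumptions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)  # floor: 1,431 * 0.8 -> 1,144
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (case, control) pairs in which the case has the
    higher score; a tied pair counts as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Outcome labels with the counts reported in the abstract: 89 cases, 1,342 controls.
labels = [1] * 89 + [0] * 1342
train, test = split_train_test(labels, 0.8, seed=0)
print(len(train), len(test))

# Toy scores (made up) to show the metric: three of four pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With only 89 cases, a purely random 80/20 split leaves few cases in the test set, so the case count on each side of the split is worth checking before comparing AUC values between models.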


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fe/6712020/34144d616724/41598_2019_48769_Fig1_HTML.jpg
