
Development of Lung Cancer Risk Prediction Machine Learning Models for Equitable Learning Health System: Retrospective Study.

Authors

Chen Anjun, Wu Erman, Huang Ran, Shen Bairong, Han Ruobing, Wen Jian, Zhang Zhiyong, Li Qinghua

Affiliations

School of Public Health, Guilin Medical University, Guilin, China.

West China Hospital, Chengdu, China.

Publication information

JMIR AI. 2024 Sep 11;3:e56590. doi: 10.2196/56590.

DOI: 10.2196/56590
PMID: 39259582
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11425024/
Abstract

BACKGROUND

Current guidelines for lung cancer (LC) screening exclude a significant proportion of young at-risk patients and nonsmokers, resulting in low screening adoption. The vision of the US National Academy of Medicine to transform health systems into learning health systems (LHS) holds promise for bringing necessary structural changes to health care, thereby addressing the exclusivity and adoption issues of LC screening.

OBJECTIVE

This study aims to realize the LHS vision by designing an equitable, machine learning (ML)-enabled LHS unit for LC screening. It focuses on developing an inclusive and practical LC risk prediction model suitable for initializing the ML-enabled LHS (ML-LHS) unit. This model aims to empower primary care physicians in a clinical research network, linking central hospitals and rural clinics, to routinely deliver risk-based screening and so enhance early detection of LC in broader populations.

METHODS

We created a standardized data set of health factors from 1397 patients with LC and 1448 control patients, all aged 30 years and older, including both smokers and nonsmokers, from a hospital's electronic medical record system. Initially, a data-centric ML approach was used to create inclusive ML models for risk prediction from all available health factors. Subsequently, a quantitative distribution of LC health factors was used in feature engineering to refine the models into a more practical model with fewer variables.
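The abstract does not say how the quantitative distribution of health factors was used to reduce the variable count. One plausible reading, offered here only as a sketch, is to rank each factor by how differently it is distributed between LC patients and controls and to keep the top-ranked factors. The function names, the prevalence-gap criterion, and the toy records below are all assumptions made for illustration and are not taken from the study.

```python
from collections import Counter


def prevalence(records):
    """Return the fraction of records in which each health factor appears."""
    counts = Counter(factor for record in records for factor in set(record))
    total = len(records)
    return {factor: count / total for factor, count in counts.items()}


def select_factors(cases, controls, k):
    """Keep the k factors with the largest case/control prevalence gap.

    Hypothetical selection rule: the study's actual criterion is not
    described in the abstract.
    """
    p_case = prevalence(cases)
    p_ctrl = prevalence(controls)
    factors = set(p_case) | set(p_ctrl)
    gap = {f: abs(p_case.get(f, 0.0) - p_ctrl.get(f, 0.0)) for f in factors}
    # Sort by descending gap, breaking ties alphabetically for determinism.
    return sorted(factors, key=lambda f: (-gap[f], f))[:k]


# Toy, made-up records: each patient is a set of recorded health factors.
cases = [
    {"cough", "smoking", "copd"},
    {"cough", "chest_pain"},
    {"smoking", "cough"},
    {"copd", "hypertension"},
]
controls = [
    {"hypertension"},
    {"cough", "hypertension"},
    {"diabetes"},
    {"hypertension", "smoking"},
]

selected = select_factors(cases, controls, k=3)
print(selected)  # ['copd', 'cough', 'hypertension']
```

In this toy data, the three selected factors each differ in prevalence by 0.5 between the two groups. A reduced factor list of this kind would then be used as the input columns of a smaller model.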

RESULTS

The initial inclusive 250-variable XGBoost model for LC risk prediction achieved 0.86 recall, 0.90 precision, and 0.89 accuracy. After feature refinement, a practical 29-variable XGBoost model achieved 0.80 recall, 0.82 precision, and 0.82 accuracy. This model met the criteria for initializing the ML-LHS unit for risk-based, inclusive LC screening within clinical research networks.
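For readers less familiar with the three reported metrics, the sketch below shows how recall, precision, and accuracy follow from the counts in a binary confusion matrix. The counts are hypothetical values chosen only for illustration; the abstract does not report the study's confusion matrix, and the function name is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute recall, precision, and accuracy from confusion-matrix counts.

    tp: LC patients correctly flagged      fn: LC patients missed
    fp: controls incorrectly flagged       tn: controls correctly cleared
    """
    recall = tp / (tp + fn)          # share of true LC cases that were found
    precision = tp / (tp + fp)       # share of flagged patients who have LC
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all correct calls
    return {"recall": recall, "precision": precision, "accuracy": accuracy}


# Hypothetical counts, not the study's data.
metrics = classification_metrics(tp=80, fp=18, fn=20, tn=82)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In a screening setting, recall is often the metric watched most closely, because a missed case (a false negative) usually costs more than an unnecessary follow-up.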

CONCLUSIONS

This study designed an innovative ML-LHS unit for a clinical research network, aiming to sustainably provide inclusive LC screening to all at-risk populations. It developed an inclusive and practical XGBoost model from hospital electronic medical record data, capable of initializing such an ML-LHS unit for community and rural clinics. Deployment of this ML-LHS unit is expected to significantly improve LC screening rates and early detection among broader populations, including those typically overlooked by existing screening guidelines.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9042/11425024/7ae70fac32b4/ai_v3i1e56590_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9042/11425024/90025288dbd2/ai_v3i1e56590_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9042/11425024/7353baa7cec2/ai_v3i1e56590_fig2.jpg
