The Use of Machine Learning for Analyzing Real-World Data in Disease Prediction and Management: Systematic Review.

Authors

Alhumaidi Norah Hamad, Dermawan Doni, Kamaruzaman Hanin Farhana, Alotaiq Nasser

Affiliations

College of Medicine, Qassim University, Buraidah, Saudi Arabia.

Applied Biotechnology, Faculty of Chemistry, Warsaw University of Technology, Warsaw, Poland.

Publication information

JMIR Med Inform. 2025 Jun 19;13:e68898. doi: 10.2196/68898.

DOI: 10.2196/68898
PMID: 40537090

Abstract

BACKGROUND

Machine learning (ML) and big data analytics are rapidly transforming health care, particularly disease prediction, management, and personalized care. With the increasing availability of real-world data (RWD) from diverse sources, such as electronic health records (EHRs), patient registries, and wearable devices, ML techniques present substantial potential to enhance clinical outcomes. Despite this promise, challenges such as data quality, model transparency, generalizability, and integration into clinical practice persist.

OBJECTIVE

This systematic review aims to examine the use of ML for analyzing RWD in disease prediction and management, identifying the most commonly used ML methods, prevalent disease types, study designs, and the sources of real-world evidence (RWE). It also explores the strengths and limitations of current practices, offering insights for future improvements.

METHODS

A comprehensive search was conducted following the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines to identify studies using ML techniques for analyzing RWD in disease prediction and management. The search focused on extracting data regarding the ML algorithms applied; disease categories studied; types of study designs (eg, clinical trials and cohort studies); and the sources of RWE, including EHRs, patient registries, and wearable devices. Studies published between 2014 and 2024 were included to ensure the analysis of the most recent advances in the field.

RESULTS

This review identified 57 studies that met the inclusion criteria, with a total sample size of >150,000 patients. The most frequently applied ML methods were random forest (n=24, 42%), logistic regression (n=21, 37%), and support vector machines (n=18, 32%). These methods were predominantly used for predictive modeling across disease areas, including cardiovascular diseases (n=19, 33%), cancer (n=9, 16%), and neurological disorders (n=6, 11%). RWE was primarily sourced from EHRs, patient registries, and wearable devices. A substantial portion of studies (n=38, 67%) focused on improving clinical decision-making, patient stratification, and treatment optimization. Among these studies, 14 (25%) focused on decision-making; 12 (21%) on health care outcomes, such as quality of life, recovery rates, and adverse events; and 11 (19%) on survival prediction, particularly in oncology and chronic diseases. For example, random forest models for cardiovascular disease prediction demonstrated an area under the curve of 0.85 (95% CI 0.81-0.89), while support vector machine models for cancer prognosis achieved an accuracy of 83% (P=.04). Despite the promising outcomes, many (n=34, 60%) studies faced challenges related to data quality, model interpretability, and ensuring generalizability across diverse patient populations.
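
The percentages reported above can be checked against the stated total of 57 included studies. The following is a minimal sketch of that arithmetic, not code from the review itself: the dictionary labels, the variable names, and the helper `percent_of_total` are illustrative assumptions, and only the counts and percentages come from the abstract. It also shows that the subgroup figures (14, 12, and 11 studies) match the reported percentages when 57, rather than 38, is used as the denominator.

```python
# Consistency check of the counts and percentages reported in RESULTS.
# Counts and percentages are copied from the abstract; all names here are
# hypothetical and chosen only for illustration.

TOTAL_STUDIES = 57  # number of included studies stated in the abstract

# label -> (reported count n, reported percentage)
reported = {
    "random forest": (24, 42),
    "logistic regression": (21, 37),
    "support vector machines": (18, 32),
    "cardiovascular diseases": (19, 33),
    "cancer": (9, 16),
    "neurological disorders": (6, 11),
    "decision-making, stratification, optimization": (38, 67),
    "decision-making": (14, 25),
    "health care outcomes": (12, 21),
    "survival prediction": (11, 19),
    "data quality and related challenges": (34, 60),
}


def percent_of_total(count, total=TOTAL_STUDIES):
    """Return count as a percentage of total, rounded to a whole number."""
    return round(100 * count / total)


# Collect every entry whose recomputed percentage differs from the reported one.
mismatches = {
    label: (n, pct, percent_of_total(n))
    for label, (n, pct) in reported.items()
    if percent_of_total(n) != pct
}

for label, (n, pct) in reported.items():
    print(f"{label}: n={n}, reported {pct}%, recomputed {percent_of_total(n)}%")

print("mismatches:", mismatches)
```

Under these assumptions, every reported percentage is reproduced with 57 as the denominator, which suggests that the subgroup percentages refer to all included studies rather than to the 38 studies in the parent group.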

CONCLUSIONS

This systematic review highlights the significant potential of ML and big data analytics in health care, especially for improving disease prediction and management. However, to fully realize the benefits of these technologies, future research must focus on addressing the challenges of data quality, enhancing model transparency, and ensuring the broader applicability of ML models across diverse populations and clinical settings.
