Machine Learning-Based Asthma Attack Prediction Models From Routinely Collected Electronic Health Records: Systematic Scoping Review.

Authors

Budiarto Arif, Tsang Kevin C H, Wilson Andrew M, Sheikh Aziz, Shah Syed Ahmar

Affiliations

Asthma UK Center for Applied Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom.

Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia.

Publication information

JMIR AI. 2023 Dec 7;2:e46717. doi: 10.2196/46717.

Abstract

BACKGROUND

An early warning tool to predict attacks could enhance asthma management and reduce the likelihood of serious consequences. Electronic health records (EHRs) providing access to historical data about patients with asthma coupled with machine learning (ML) provide an opportunity to develop such a tool. Several studies have developed ML-based tools to predict asthma attacks.

OBJECTIVE

This study aims to critically evaluate ML-based models derived using EHRs for the prediction of asthma attacks.

METHODS

We systematically searched PubMed and Scopus (search period: January 1, 2012, to January 31, 2023) for papers meeting the following inclusion criteria: (1) used EHR data as the main data source, (2) used asthma attack as the outcome, and (3) compared the performance of ML-based prediction models. We excluded non-English papers and nonresearch papers, such as commentaries and systematic reviews. We also excluded papers, including protocol papers, that did not provide any details about their ML approach and its results. The selected studies were then summarized across multiple dimensions, including data preprocessing methods, ML algorithms, model validation, model explainability, and model implementation.

RESULTS

Overall, 17 papers were included at the end of the selection process. There was considerable heterogeneity in how asthma attacks were defined. Of the 17 studies, 8 (47%) used routinely collected data from both primary care and secondary care practices. Extremely imbalanced data was a notable issue in most studies (13/17, 76%), but only 38% (5/13) of them explicitly dealt with it in their data preprocessing pipeline. A gradient boosting-based method was the best-performing ML method in 59% (10/17) of the studies. Of the 17 studies, 14 (82%) used a model explanation method to identify the most important predictors. None of the studies followed the standard reporting guidelines, and none were prospectively validated.
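
The abstract does not say which techniques the five studies used to handle class imbalance. As an illustration only, the sketch below shows one common option, inverse-frequency class weighting, in which the rare class (attacks) receives a proportionally larger weight during training. The function name, the toy labels, and the 5% event rate are all hypothetical and are not taken from the reviewed studies.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency.

    Uses the common heuristic n_samples / (n_classes * class_count),
    so that every class contributes the same total weight.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels: 1 = asthma attack, 0 = no attack (assumed 5% event rate).
labels = [1] * 5 + [0] * 95
weights = balanced_class_weights(labels)

# Total weight per class after reweighting (equal by construction).
totals = {cls: weights[cls] * labels.count(cls) for cls in weights}
print(weights)
print(totals)
```

Weights of this kind can typically be passed to a learner as per-sample weights. Resampling (oversampling the minority class or undersampling the majority class) is another common choice, and which one works better depends on the data.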

CONCLUSIONS

Our review indicates that this research field is still underdeveloped, given the limited body of evidence, heterogeneity of methods, lack of external validation, and suboptimally reported models. We highlighted several technical challenges (class imbalance, external validation, model explanation, and adherence to reporting guidelines to aid reproducibility) that need to be addressed to make progress toward clinical adoption.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/433b/11041490/231587cf924d/ai_v2i1e46717_fig1.jpg
