Faculty of Health Sciences, Department of Pharmacy, UiT-The Arctic University of Norway, Tromsø, Norway.
Faculty of Science and Technology, Department of Computer Science, UiT-The Arctic University of Norway, Tromsø, Norway.
PLoS One. 2024 Aug 23;19(8):e0309175. doi: 10.1371/journal.pone.0309175. eCollection 2024.
In this review, we investigated how Machine Learning (ML) was utilized to predict all-cause somatic hospital admissions and readmissions in adults.
We searched eight databases (PubMed, Embase, Web of Science, CINAHL, ProQuest, OpenGrey, WorldCat, and MedNar) from their inception date to October 2023, and included records that predicted all-cause somatic hospital admissions and readmissions of adults using ML methodology. We used the CHARMS checklist for data extraction, PROBAST for bias and applicability assessment, and TRIPOD for reporting quality.
We screened 7,543 studies, of which 163 full-text records were read and 116 met the review inclusion criteria. Among these, 45 predicted admission, 70 predicted readmission, and one study predicted both. There was substantial variety in the types of datasets, algorithms, features, data preprocessing steps, evaluation, and validation methods. The most used types of features were demographics, diagnoses, vital signs, and laboratory tests. Area under the ROC curve (AUC) was the most used evaluation metric. Models trained using boosting tree-based algorithms often performed better than others. ML algorithms commonly outperformed traditional regression techniques. Sixteen studies used natural language processing (NLP) of clinical notes for prediction, and all of them yielded good results. The overall adherence to reporting quality was poor in the reviewed studies. Only five percent of models were implemented in clinical practice. The most frequently inadequately addressed methodological aspects were: providing model interpretations on the individual patient level, full code availability, performing external validation, calibrating models, and handling class imbalance.
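As an illustration only, the sketch below shows how AUC, the most used metric above, can be computed from predicted risks, alongside the Brier score, a simple measure of probability accuracy that is often reported when calibration is assessed. The toy labels and probabilities, and the function names `roc_auc` and `brier_score`, are hypothetical and not taken from any reviewed study.

```python
def roc_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome
    (lower is better; 0 is a perfect probabilistic forecast)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical imbalanced toy data: 2 readmissions among 10 patients.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.05, 0.10, 0.10, 0.20, 0.15, 0.30, 0.25, 0.60, 0.40, 0.70]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)

# A model that assigns every patient the same risk cannot rank patients,
# so its AUC is 0.5 regardless of how imbalanced the classes are.
constant_auc = roc_auc(y_true, [0.2] * len(y_true))

print(f"AUC={auc:.4f}  Brier={brier:.5f}  constant-model AUC={constant_auc:.1f}")
```

AUC captures only how well patients are ranked; it does not show whether a predicted risk of, say, 0.4 corresponds to a 40% observed event rate. That is one reason calibration and external validation appear among the inadequately addressed aspects listed above.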
This review has identified considerable concerns regarding methodological issues and reporting quality in studies investigating ML to predict hospitalizations. To ensure the acceptability of these models in clinical settings, it is crucial to improve the quality of future studies.