Hu Qiaozhi, Li Jiafeng, Li Xiaoqi, Zou Dan, Xu Ting, He Zhiyao
Department of Pharmacy, West China Hospital, Sichuan University, Chengdu, Sichuan, China.
West China School of Medicine, Sichuan University, Chengdu, Sichuan, China.
J Int Med Res. 2024 Dec;52(12):3000605241302304. doi: 10.1177/03000605241302304.
This systematic review aimed to provide a comprehensive overview of the application of machine learning (ML) in predicting multiple adverse drug events (ADEs) using electronic health record (EHR) data.
Systematic searches were conducted using PubMed, Web of Science, Embase, and IEEE Xplore from database inception until 21 November 2023. Studies that developed ML models for predicting multiple ADEs based on EHR data were included.
Ten studies met the inclusion criteria. Twenty ML methods were reported, most commonly random forest (RF, n = 9), followed by AdaBoost (n = 4), eXtreme Gradient Boosting (n = 3), and support vector machine (n = 3). The mean area under the receiver operating characteristic curve (AUC) was 0.76 (95% confidence interval [CI] = 0.26-0.95). RF combined with resampling-based approaches achieved high AUCs (0.9448-0.9457). Common risk factors for ADEs included length of hospital stay, number of prescribed drugs, and admission type. The pooled estimated AUC was 0.72 (95% CI = 0.68-0.75).
Future studies should adhere to more rigorous reporting standards and consider new ML methods to facilitate the application of ML models in clinical practice.
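As an illustration of two ideas mentioned in the results, the sketch below computes an AUC from predicted risk scores and balances an imbalanced data set by resampling. It is a minimal sketch under stated assumptions, not the method of any included study: the abstract does not say which resampling approach was used, so random oversampling of the minority class is assumed here, and the function names (`auc`, `random_oversample`) and all data values are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_oversample(rows, labels, seed=0):
    """Assumed resampling scheme: duplicate randomly chosen minority-class
    rows until both classes have the same number of rows."""
    rng = random.Random(seed)
    groups = {0: [], 1: []}
    for row, y in zip(rows, labels):
        groups[y].append(row)
    minority, majority = sorted(groups, key=lambda c: len(groups[c]))
    if not groups[minority]:
        raise ValueError("cannot oversample an empty class")
    deficit = len(groups[majority]) - len(groups[minority])
    extra = [rng.choice(groups[minority]) for _ in range(deficit)]
    return list(rows) + extra, list(labels) + [minority] * deficit


# Hypothetical toy data: 1 = patient had an ADE, 0 = no ADE.
labels = [0, 0, 0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]  # made-up model risk scores
# Made-up features: [length of stay in days, number of prescribed drugs]
rows = [[3, 2], [5, 4], [2, 1], [9, 8], [12, 7], [15, 11]]

toy_auc = auc(labels, scores)
balanced_rows, balanced_labels = random_oversample(rows, labels)

print(toy_auc)
print(balanced_labels.count(0), balanced_labels.count(1))
```

In practice, resampling would normally be applied to the training split only, so that duplicated cases do not leak into the data used to estimate the AUC.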