School of Pharmacy, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon, Gyeonggi-do, 16419, Republic of Korea.
Department of Health Administration, College of Nursing and Health, Kongju National University, Gongju, Republic of Korea.
Sci Rep. 2022 Sep 1;12(1):14869. doi: 10.1038/s41598-022-18522-z.
Interest in using machine learning (ML) in pharmacovigilance is growing. This study investigated the utility of supervised ML algorithms for the timely detection of safety signals in the Korea Adverse Event Reporting System (KAERS) between 2009 and 2018, using infliximab as a case drug. The input dataset for ML training was constructed from drug label information and spontaneous reports in the KAERS. A gold standard dataset containing known adverse events (AEs) was randomly divided into training and test sets. Two supervised ML algorithms (gradient boosting machine [GBM] and random forest [RF]) were fitted, with hyperparameters tuned on the training set using fivefold cross-validation. We then stratified the KAERS data by calendar year to create 10 cumulative yearly datasets, to which the ML algorithms were applied to detect five pre-specified AEs of infliximab identified during post-marketing surveillance. Both GBM and RF detected four of the five AEs in the first year they appeared in the KAERS, before they were added to the infliximab drug label. We further applied our models to data retrieved from the US Food and Drug Administration Adverse Event Reporting System repository and found that they outperformed existing disproportionality methods. Both GBM and RF performed reliably in detecting early safety signals, suggesting that such approaches hold promise for pharmacovigilance.
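For readers unfamiliar with the baseline the ML models were compared against, the following is a minimal sketch of a conventional disproportionality check run over cumulative yearly datasets. It is not the study's method. The abstract does not say which disproportionality measures were used, so the reporting odds ratio (ROR), the signal rule (at least 3 cases and a 95% confidence interval lower bound above 1), the function names, and the toy reports are all assumptions made for illustration.

```python
from math import exp, log, sqrt

# Hypothetical toy reports as (year, drug, adverse_event). Not KAERS data.
REPORTS = (
    [(2009, "infliximab", "event_x")] * 1
    + [(2009, "infliximab", "other")] * 10
    + [(2009, "other_drug", "event_x")] * 2
    + [(2009, "other_drug", "other")] * 50
    + [(2010, "infliximab", "event_x")] * 5
    + [(2010, "infliximab", "other")] * 10
    + [(2010, "other_drug", "event_x")] * 2
    + [(2010, "other_drug", "other")] * 50
)


def cumulative_by_year(reports):
    """Yield (year, all reports received up to and including that year)."""
    for year in sorted({r[0] for r in reports}):
        yield year, [r for r in reports if r[0] <= year]


def two_by_two(reports, drug, event):
    """Count the 2x2 contingency table (a, b, c, d) for a drug-event pair."""
    a = b = c = d = 0
    for _, rep_drug, rep_event in reports:
        if rep_drug == drug and rep_event == event:
            a += 1  # target drug, target event
        elif rep_drug == drug:
            b += 1  # target drug, other events
        elif rep_event == event:
            c += 1  # other drugs, target event
        else:
            d += 1  # other drugs, other events
    return a, b, c, d


def ror_with_ci(a, b, c, d):
    """Return (ROR, lower, upper) with a 95% CI on the log scale.

    Adds 0.5 to every cell when any cell is zero (an assumed correction).
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    ror = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return ror, exp(log(ror) - 1.96 * se), exp(log(ror) + 1.96 * se)


def first_signal_year(reports, drug, event, min_cases=3):
    """Return the first cumulative year meeting the assumed signal rule."""
    for year, subset in cumulative_by_year(reports):
        a, b, c, d = two_by_two(subset, drug, event)
        if a < min_cases:
            continue
        _, lower, _ = ror_with_ci(a, b, c, d)
        if lower > 1:
            return year
    return None


if __name__ == "__main__":
    print(first_signal_year(REPORTS, "infliximab", "event_x"))
```

Evaluating each cumulative dataset in turn mirrors the study's question of how early a signal could have been flagged. A supervised model such as GBM or RF would replace the fixed threshold rule with a classifier trained on the gold standard dataset.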