Cao Mengqi, Chi Yanna, Yu Jinyang, Yang Yu, Meng Ruogu, Jia Jinzhu
Department of Biostatistics, School of Public Health, Peking University, Beijing, China.
Department of Genetics, Peking University Cancer Hospital and Institute, Beijing, China.
Front Pharmacol. 2025 Jun 23;16:1554650. doi: 10.3389/fphar.2025.1554650. eCollection 2025.
Drug safety has become an increasingly serious public health problem that threatens health and imposes social and economic costs. Commonly used signal detection methods suffer from high false-positive rates. This study aims to introduce deep learning models into adverse drug reaction (ADR) signal detection and to compare them with other methods.
The data consist of adverse events collected by the Center for ADR Monitoring of Guangdong. Traditional statistical methods were used for preliminary screening of the data. We transformed the data into free text, extracted text information, and performed classification with a Long Short-Term Memory (LSTM) model. We compared it with existing signal detection methods, including Logistic Regression, Random Forest, K-Nearest Neighbors, and Multilayer Perceptron. The feature importance of the included variables was also analyzed.
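The step of transforming structured reports into free text can be illustrated with a minimal sketch. The abstract does not give the report schema or the tokenization used, so the field names (`reason_for_medication`, `dosage`, `reaction`), the "field: value" sentence template, and the whitespace tokenizer below are all assumptions made for illustration, not the authors' pipeline. The output is a set of fixed-length integer sequences of the kind an LSTM embedding layer would consume.

```python
# Hypothetical sketch: flatten structured ADR reports into text, then encode
# them as padded integer sequences. Field names and template are assumptions.

def record_to_text(record: dict) -> str:
    """Join non-empty fields as 'field name: value' clauses."""
    parts = [
        f"{key.replace('_', ' ')}: {value}"
        for key, value in record.items()
        if value not in (None, "")
    ]
    return ". ".join(parts)

def build_vocab(texts: list) -> dict:
    """Assign an integer id to each whitespace token; 0 = padding, 1 = unknown."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text: str, vocab: dict, max_len: int) -> list:
    """Map tokens to ids, truncate to max_len, and right-pad with zeros."""
    ids = [vocab.get(token, 1) for token in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))

# Made-up example reports (not data from the study).
reports = [
    {"reason_for_medication": "hypertension", "dosage": "10 mg daily", "reaction": "rash"},
    {"reason_for_medication": "infection", "dosage": "", "reaction": "nausea"},
]
texts = [record_to_text(r) for r in reports]
vocab = build_vocab(texts)
encoded = [encode(t, vocab, max_len=12) for t in texts]
```

Skipping empty fields keeps missing values from adding noise tokens; a real pipeline would also need a tokenizer suited to the language of the reports.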
A total of 2,376 ADR signals were identified between January 2018 and December 2019, comprising 448 positive signals and 1,928 negative signals. The LSTM model based on free text reached a sensitivity of 95.16% and an F1-score of 0.9706. The Logistic Regression model based on feature variables reached a sensitivity of 86.83% and an F1-score of 0.9063. Our model therefore achieved higher sensitivity and a higher F1-score than the traditional methods. The variables "Reasons for taking medication", "Serious ADR scenario 4", "Adverse reaction analysis 5", and "Dosage" had an important influence on the result.
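For readers less familiar with the two reported metrics, the following sketch shows how sensitivity and the F1-score are computed from confusion-matrix counts. The counts used here are invented for illustration and are not taken from the study, whose confusion matrices are not given in the abstract.

```python
# Sensitivity and F1-score from confusion-matrix counts.
# tp = true positives, fp = false positives, fn = false negatives.

def sensitivity(tp: int, fn: int) -> float:
    """Share of true positive signals that the model detects (recall)."""
    return tp / (tp + fn)

def precision(tp: int, fp: int) -> float:
    """Share of predicted positive signals that are truly positive."""
    return tp / (tp + fp)

def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and sensitivity."""
    return 2 * tp / (2 * tp + fp + fn)

# Illustrative counts only (not from the paper).
tp, fp, fn = 90, 5, 10
print(f"sensitivity = {sensitivity(tp, fn):.4f}")
print(f"precision   = {precision(tp, fp):.4f}")
print(f"F1-score    = {f1_score(tp, fp, fn):.4f}")
```

Because the F1-score combines sensitivity with precision, it also penalizes false positives, which is the weakness of common detection methods that the study sets out to address.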
The application of deep learning models shows potential to improve the detection performance in ADR monitoring.