Zhao Haochen, Li Dingxi, Zhong Jian, Liang Xiao, Duan Guihua, Wang Jianxin
Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China.
Bioinformatics. 2025 Jun 2;41(6). doi: 10.1093/bioinformatics/btaf319.
Drug side effects are harmful or adverse reactions that occur during drug use and are unrelated to the therapeutic purpose. A core problem in drug side effect prediction is determining how frequently these side effects occur in the population, which can guide patient medication use and drug development. Many computational methods have been developed to predict the frequency of drug side effects as an alternative to clinical trials. However, existing methods typically build regression models over five frequency classes of drug side effects, which leads to boundary-handling issues and a tendency to overfit the training set.
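The boundary-handling issue can be illustrated with a minimal Python sketch, assuming NumPy is available. The helper `scores_to_classes` and the example scores are hypothetical and are not taken from any published method; the sketch only shows that a continuous regression output has to be rounded and clipped before it can be read as one of five classes.

```python
import numpy as np

def scores_to_classes(scores, n_classes=5):
    """Map continuous regression outputs to frequency classes 1..n_classes.

    Hypothetical helper: rounds half up, then clips out-of-range values.
    """
    rounded = np.floor(np.asarray(scores, dtype=float) + 0.5)
    return np.clip(rounded, 1, n_classes).astype(int)

# Made-up scores: two fall outside the valid range [1, 5],
# and two sit on either side of the 2/3 class boundary.
scores = np.array([0.2, 2.49, 2.51, 5.8])
classes = scores_to_classes(scores)
print(classes.tolist())
```

In this sketch, scores of 2.49 and 2.51 differ by only 0.02 yet land in different classes, and scores below 1 or above 5 must be clipped because the regression target has no notion of a first or last class.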
To address these problems, we develop MSSF, a multi-source similarity fusion-based model for predicting the five frequency classes of drug side effects. Unlike existing methods, MSSF uses a multi-source feature fusion module and a self-attention mechanism to capture the relationships between drugs and side effects, and employs Bayesian variational inference to predict the frequency classes more accurately. Experimental results indicate that MSSF consistently outperforms existing models across multiple evaluation settings, including cross-validation, cold-start experiments, and independent testing. Visual analysis and case studies further demonstrate MSSF's reliable feature extraction capability and its promise for predicting the frequency classes of drug side effects.
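As a rough illustration of the two building blocks named above, the following Python sketch fuses several similarity matrices and applies self-attention to the result. It is a generic sketch under stated assumptions, not the MSSF implementation: the fusion rule (a plain average), the identity projections in the attention step, the function names, and the random input matrices are all assumptions made for illustration. The Bayesian variational inference component is not covered.

```python
import numpy as np

def fuse_similarities(matrices):
    """Fuse several (n x n) similarity matrices by averaging (assumed rule)."""
    return np.mean(np.stack(matrices, axis=0), axis=0)

def self_attention(x):
    """Single-head scaled dot-product self-attention.

    Uses identity query/key/value projections for brevity (an assumption);
    a trained model would learn these projections.
    """
    d = x.shape[-1]
    logits = x @ x.T / np.sqrt(d)
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ x, weights

# Placeholder similarity matrices for 4 drugs from two hypothetical sources.
rng = np.random.default_rng(0)
source_a = rng.random((4, 4))
source_b = rng.random((4, 4))

fused = fuse_similarities([source_a, source_b])
attended, weights = self_attention(fused)
print(attended.shape, weights.sum(axis=1))
```

Each row of `weights` sums to 1, so every drug's output representation is a weighted mixture of the fused similarity profiles of all drugs.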
The source code of MSSF is available on GitHub (https://github.com/dingxlcse/MSSF.git) and archived on Zenodo (DOI: 10.5281/zenodo.15462041).