Galeano Diego, Li Shantao, Gerstein Mark, Paccanaro Alberto
Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway, University of London, Egham Hill, Egham, UK.
School of Applied Mathematics, Fundação Getulio Vargas, Rio de Janeiro, Brazil.
Nat Commun. 2020 Sep 11;11(1):4575. doi: 10.1038/s41467-020-18305-y.
A central issue in drug risk-benefit assessment is identifying frequencies of side effects in humans. Currently, frequencies are experimentally determined in randomised controlled clinical trials. We present a machine learning framework for computationally predicting frequencies of drug side effects. Our matrix decomposition algorithm learns latent signatures of drugs and side effects that are both reproducible and biologically interpretable. We show the usefulness of our approach on 759 structurally and therapeutically diverse drugs and 994 side effects from all human physiological systems. Our approach can be applied to any drug for which a small number of side effect frequencies have been identified, in order to predict the frequencies of further, yet unidentified, side effects. We show that our model is informative of the biology underlying drug activity: individual components of the drug signatures are related to the distinct anatomical categories of the drugs and to the specific drug routes of administration.
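The idea of learning latent drug and side effect signatures by matrix decomposition can be illustrated with a minimal sketch. It is a generic weighted non-negative matrix factorization with multiplicative updates, not necessarily the algorithm used in the paper. The function name `factorize`, the parameters `k` and `alpha`, the 0 to 5 frequency coding, and the toy matrix are all assumptions made for illustration.

```python
import numpy as np

def factorize(R, k=2, alpha=0.05, n_iter=500, seed=0):
    """Approximate R (drugs x side effects) as W @ H with W, H >= 0.

    Assumed setup (not taken from the paper): entries of R are frequency
    classes, 0 means "no frequency reported", and zeros are down-weighted
    by `alpha` instead of being treated as confirmed absences.
    """
    rng = np.random.default_rng(seed)
    n_drugs, n_effects = R.shape
    W = rng.random((n_drugs, k)) + 0.1    # drug signatures
    H = rng.random((k, n_effects)) + 0.1  # side effect signatures
    C = np.where(R > 0, 1.0, alpha)       # per-entry weights
    eps = 1e-12                           # guards against division by zero
    for _ in range(n_iter):
        # Multiplicative updates for the weighted squared error
        # sum(C * (R - W @ H) ** 2); they keep W and H non-negative.
        W *= ((C * R) @ H.T) / ((C * (W @ H)) @ H.T + eps)
        H *= (W.T @ (C * R)) / (W.T @ (C * (W @ H)) + eps)
    return W, H

def weighted_loss(R, W, H, alpha=0.05):
    """Weighted squared reconstruction error minimized by `factorize`."""
    C = np.where(R > 0, 1.0, alpha)
    return float(np.sum(C * (R - W @ H) ** 2))

# Hypothetical toy data: 4 drugs x 5 side effects, frequency classes 0..5.
R = np.array([
    [5, 3, 0, 0, 1],
    [4, 0, 0, 1, 1],
    [0, 0, 4, 5, 0],
    [0, 1, 5, 4, 0],
], dtype=float)

W, H = factorize(R, k=2)
predicted = W @ H  # scores for every drug-side effect pair, including unobserved ones
```

Entries of `predicted` at positions where `R` is zero play the role of predicted frequencies for side effects not yet identified, and the rows of `W` are the drug signatures whose components one would then inspect for interpretability.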