LaBute Montiago X, Zhang Xiaohua, Lenderman Jason, Bennion Brian J, Wong Sergio E, Lightstone Felice C
Computational Engineering Division, Lawrence Livermore National Laboratory, Livermore, California, United States of America.
Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California, United States of America.
PLoS One. 2014 Sep 5;9(9):e106298. doi: 10.1371/journal.pone.0106298. eCollection 2014.
Late-stage or post-market identification of adverse drug reactions (ADRs) is a significant public health issue and a source of major economic liability for drug development. Thus, reliable in silico screening of drug candidates for possible ADRs would be advantageous. In this work, we introduce a computational approach that predicts ADRs by combining molecular docking results with known ADR information from DrugBank and SIDER. We employed a recently parallelized version of AutoDock Vina (VinaLC) to dock 906 small molecule drugs to a virtual panel of 409 DrugBank protein targets. L1-regularized logistic regression models were trained on the resulting docking scores of a 560-compound subset of the initial 906 compounds to predict 85 side effects, grouped into 10 ADR phenotype groups. Only 21% (87 out of 409) of the drug-protein binding features involve known targets of the drug subset, providing a significant probe of off-target effects. As a control, associations of this drug subset with the 555 annotated targets of these compounds, as reported in DrugBank, were used as features to train a separate group of models. The Vina off-target models and the DrugBank on-target models yielded comparable median areas under the receiver operating characteristic curve (AUCs) during 10-fold cross-validation (0.60-0.69 and 0.61-0.74, respectively). Evidence was found in the PubMed literature to support several putative ADR-protein associations identified by our analysis. Among them, several associations between neoplasm-related ADRs and known tumor suppressor and tumor invasiveness marker proteins were found. A dual role for interstitial collagenase in both neoplasms and aneurysm formation was also identified. These associations all involve off-target proteins and could not have been found using available drug/on-target interaction data.
This study illustrates a path forward to comprehensive ADR virtual screening that can potentially scale, with an increasing number of CPUs, to tens of thousands of protein targets and millions of potential drug candidates.
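The modeling step described in the abstract (L1-regularized logistic regression on docking scores, evaluated by 10-fold cross-validated AUC) can be illustrated with a minimal sketch. This is not the authors' code: the data are synthetic stand-ins for docking scores, the optimizer is a plain proximal gradient loop chosen for self-containment, and every function name, the penalty strength, the step size, and the fold assignment are assumptions made for illustration.

```python
import math
import random
from statistics import median

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.05, lr=0.5, n_iter=100):
    """L1-penalized logistic regression via proximal gradient descent.
    lam, lr and n_iter are illustrative values; the intercept is unpenalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        # Gradient step on the logistic loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(w[j] - lr * grad_w[j] / n, lr * lam) for j in range(p)]
        b -= lr * grad_b / n
    return w, b

def auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly;
    tied pairs count one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if a > c else 0.5 if a == c else 0.0 for a in pos for c in neg)
    return wins / (len(pos) * len(neg))

def make_synthetic(n=120, p=10, seed=0):
    """Synthetic 'docking scores': n compounds x p targets (NOT real data).
    Compounds with the side effect (label 1) score lower on targets 0 and 1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        if label == 1:
            row[0] -= 1.5
            row[1] -= 1.0
        X.append(row)
        y.append(label)
    return X, y

def cross_validated_aucs(X, y, k=10, lam=0.05):
    """k-fold cross-validation. Consecutive (label 0, label 1) pairs are dealt
    round-robin into folds, so each held-out fold contains both classes."""
    n = len(X)
    fold_of = [(i // 2) % k for i in range(n)]
    aucs = []
    for f in range(k):
        train = [i for i in range(n) if fold_of[i] != f]
        test = [i for i in range(n) if fold_of[i] == f]
        w, b = fit_l1_logistic([X[i] for i in train], [y[i] for i in train], lam=lam)
        scores = [sum(wj * xj for wj, xj in zip(w, X[i])) + b for i in test]
        aucs.append(auc(scores, [y[i] for i in test]))
    return aucs

X, y = make_synthetic()
aucs = cross_validated_aucs(X, y)
print("median 10-fold AUC:", round(median(aucs), 2))
```

In this sketch the L1 penalty drives the weights of uninformative targets toward exactly zero, which is what lets the surviving nonzero weights be read as candidate ADR-protein associations. In practice an established library implementation would likely be preferable to this hand-written loop.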