Institute of Parasitology, McGill University, Montreal, QC, H9X 3V9, Canada.
Princess Margaret Cancer Centre, University Health Network, Toronto, ON, M5G 1L7, Canada.
Sci Rep. 2018 Jun 14;8(1):9110. doi: 10.1038/s41598-018-27495-x.
High-throughput screening (HTS) experimentally tests large numbers of chemical compounds to identify those active in a given assay. A faster and cheaper alternative is large-scale virtual screening, performed computationally with quantitative structure-activity relationship (QSAR) models. However, the vast amount of heterogeneous HTS data available and the imbalanced ratio of active to inactive compounds in an assay make building such models challenging. Although different QSAR models have been proposed, they have limitations such as high false positive rates, complicated user interfaces, and limited utilization options. We therefore developed DPubChem, a novel web tool for deriving QSAR models that implement state-of-the-art machine-learning techniques to enhance the precision of the models and enable efficient analysis of experiments from the PubChem BioAssay database. DPubChem also has a simple interface that provides various options to users. Across 300 datasets, DPubChem predicted active compounds with an average geometric mean of 76.68% and an average F score of 76.53%. Furthermore, DPubChem builds interaction networks that highlight novel predicted links between chemical compounds and biological assays. Using such a network, DPubChem successfully suggested a novel drug for Niemann-Pick type C disease. DPubChem is freely available at www.cbrc.kaust.edu.sa/dpubchem.
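The abstract reports a geometric mean and an F score but does not define them. The sketch below assumes the definitions commonly used for imbalanced classification: the geometric mean as the square root of sensitivity times specificity, and the F score as F1, the harmonic mean of precision and recall. The function name `gmean_and_f1` and the confusion-matrix counts are hypothetical illustrations, not values or code from DPubChem.

```python
import math


def gmean_and_f1(tp, fp, tn, fn):
    """Return (geometric mean, F1) from binary confusion-matrix counts.

    Assumed definitions (not confirmed by the abstract):
      geometric mean = sqrt(sensitivity * specificity)
      F1             = 2 * precision * recall / (precision + recall)
    """
    # Recall on the active class; 0.0 if there are no actives at all.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Recall on the inactive class; 0.0 if there are no inactives at all.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Fraction of predicted actives that are truly active.
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    g_mean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return g_mean, f1


# Hypothetical imbalanced assay: 100 actives among 10,000 compounds.
g_model, f_model = gmean_and_f1(tp=80, fp=20, tn=9880, fn=20)

# A trivial classifier that labels every compound inactive is 99% accurate
# on the same data, yet both metrics collapse to zero.
g_trivial, f_trivial = gmean_and_f1(tp=0, fp=0, tn=9900, fn=100)
```

Both metrics penalize a model that ignores the minority (active) class, which is why they are more informative than plain accuracy when actives are rare.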