Department of Chemistry, University of Vermont, Burlington, VT, 05405, USA.
Department of Industrial and Physical Pharmacy, Department of Chemistry, Purdue University, West Lafayette, IN, 47906, USA.
J Comput Aided Mol Des. 2023 Mar;37(3):147-156. doi: 10.1007/s10822-023-00497-2. Epub 2023 Feb 25.
Molecules with bioactivity towards G protein-coupled receptors represent a subset of the vast space of small drug-like molecules. Here, we compare machine learning models, including dilated graph convolutional networks, that perform binary classification to quickly identify molecules with activity towards G protein-coupled receptors. The models are trained and validated using a large set of over 600,000 active, inactive, and decoy compounds. The best performing machine learning model, dubbed GPCRLigNet, was a surprisingly simple feedforward dense neural network mapping from Morgan fingerprints to activity. Incorporation of GPCRLigNet into a high-throughput virtual screening workflow is demonstrated with molecular docking towards a particular G protein-coupled receptor, the pituitary adenylate cyclase-activating polypeptide receptor type 1. Through rigorous comparison of docking scores for molecules selected with and without GPCRLigNet, we demonstrate that GPCRLigNet enriches the selection in potentially potent molecules. This work provides a proof of principle that GPCRLigNet can effectively narrow the chemical search space towards ligands with G protein-coupled receptor activity.
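The abstract describes the best model as a feedforward dense network that maps a Morgan fingerprint to an activity prediction. The following is a minimal sketch of that kind of forward pass, not the authors' implementation. The layer sizes, the 64-bit toy fingerprint length, the 0.5 decision threshold, and all function names are assumptions made for illustration, and the weights are random and untrained. The fingerprint is a hypothetical random bit vector standing in for one that a cheminformatics toolkit would compute from a real molecule.

```python
import math
import random

def dense(x, weights, biases):
    """One fully connected layer: returns W @ x + b as a list."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(v):
    """Element-wise rectified linear activation."""
    return [max(0.0, a) for a in v]

def sigmoid(z):
    """Squash a real number into (0, 1) so it reads as a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(n_in, n_out, rng):
    """Random, untrained weights (illustrative only) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def predict_activity(fingerprint, layers):
    """Forward pass: hidden ReLU layers, then a single sigmoid output."""
    h = [float(bit) for bit in fingerprint]
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
    weights, biases = layers[-1]
    return sigmoid(dense(h, weights, biases)[0])

rng = random.Random(0)
N_BITS = 64  # assumed toy length; real Morgan fingerprints are usually much longer

# Assumed architecture: two hidden layers, one output unit
layers = [init_layer(N_BITS, 16, rng),
          init_layer(16, 8, rng),
          init_layer(8, 1, rng)]

# Hypothetical fingerprint: random bits, not derived from a real molecule
fingerprint = [rng.randint(0, 1) for _ in range(N_BITS)]

score = predict_activity(fingerprint, layers)
label = "active" if score >= 0.5 else "inactive"  # assumed decision threshold
print(f"score={score:.3f} -> {label}")
```

In a screening workflow of the kind the abstract outlines, a trained version of such a classifier would score each candidate molecule, and only those predicted active would be passed on to the more expensive docking step.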