Jiang Huaipan, Fan Mengran, Wang Jian, Sarma Anup, Mohanty Shruti, Dokholyan Nikolay V, Mahdavi Mehrdad, Kandemir Mahmut T
Department of Computer Science and Engineering, Pennsylvania State University, State College 16802, United States.
Departments of Pharmacology and Biochemistry and Molecular Biology, Pennsylvania State College of Medicine, Hershey 17033, United States.
J Chem Inf Model. 2020 Oct 26;60(10):4594-4602. doi: 10.1021/acs.jcim.0c00542. Epub 2020 Oct 14.
High-performance computational techniques have brought significant benefits to drug discovery efforts in recent decades. One of the most challenging problems in drug discovery is protein-ligand binding pose prediction. Conventional structure-based molecular docking methods predict the most stable structure of the complex by searching an exponentially large solution space, and their performance depends heavily on the accuracy of the scoring or energy function (an approximation of binding affinity) that is evaluated for each pose of the protein-ligand complex to guide that search. However, because molecular structures are heterogeneous, existing scoring methods are either tailored to a particular data set or fail to achieve high accuracy. In this paper, we propose a convolutional neural network (CNN)-based model that learns to predict the stability factor of a protein-ligand complex, and we demonstrate that CNNs can improve existing docking software. Evaluation results on the PDBbind data set indicate that our approach reduces the execution time of the traditional docking-based method while improving its accuracy. Our code, experiment scripts, and pretrained models are available at https://github.com/j9650/MedusaNet.
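As a rough illustration of how a learned stability predictor might cut docking time, the sketch below filters candidate poses with a classifier before running the more expensive energy function. It is not taken from the MedusaNet repository: the names `select_pose`, `toy_predictor`, and `toy_energy`, the pose encoding, and the 0.5 threshold are all assumptions made for this example, and the toy predictor merely stands in for a trained CNN.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

# Hypothetical pose encoding: a flat sequence of numbers (e.g. ligand coordinates).
Pose = Sequence[float]


def select_pose(
    poses: Iterable[Pose],
    predict_stability: Callable[[Pose], float],  # stand-in for a trained CNN, returns a score in [0, 1]
    score_energy: Callable[[Pose], float],       # stand-in for a docking energy function (lower is better)
    threshold: float = 0.5,                      # assumed cut-off, not a value from the paper
) -> Tuple[Optional[Pose], int]:
    """Return the lowest-energy pose among those the predictor accepts,
    together with the number of energy evaluations that were performed."""
    best_pose: Optional[Pose] = None
    best_energy = float("inf")
    evaluations = 0
    for pose in poses:
        # Skip poses the predictor considers unstable, so the costly
        # energy function is never called on them.
        if predict_stability(pose) < threshold:
            continue
        energy = score_energy(pose)
        evaluations += 1
        if energy < best_energy:
            best_pose, best_energy = pose, energy
    return best_pose, evaluations


def toy_predictor(pose: Pose) -> float:
    # Toy rule standing in for a CNN: accept poses whose first coordinate is small.
    return 1.0 if abs(pose[0]) < 2.0 else 0.0


def toy_energy(pose: Pose) -> float:
    # Toy energy: squared distance from the origin.
    return sum(x * x for x in pose)


candidates = [(3.0, 0.0), (1.0, 1.0), (0.5, 0.0), (-4.0, 1.0)]
best, n_scored = select_pose(candidates, toy_predictor, toy_energy)
print(best, n_scored)
```

In this toy run only two of the four candidates reach the energy function, which is the kind of saving a reliable predictor could provide. Whether accuracy also improves depends entirely on how well the predictor separates stable from unstable poses.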