McNutt Andrew T, Francoeur Paul, Aggarwal Rishal, Masuda Tomohide, Meli Rocco, Ragoza Matthew, Sunseri Jocelyn, Koes David Ryan
Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA.
Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500 032, India.
J Cheminform. 2021 Jun 9;13(1):43. doi: 10.1186/s13321-021-00522-2.
Molecular docking computationally predicts the conformation of a small molecule when binding to a receptor. Scoring functions are a vital piece of any molecular docking pipeline as they determine the fitness of sampled poses. Here we describe and evaluate the 1.0 release of the GNINA docking software, which utilizes an ensemble of convolutional neural networks (CNNs) as a scoring function. We also explore an array of parameter values for GNINA 1.0 to optimize docking performance and computational cost. Docking performance is evaluated as the percentage of targets where the top-ranked pose has a root mean square deviation (RMSD) below 2 Å from the known binding pose (Top1), and is compared to AutoDock Vina scoring when utilizing explicitly defined binding pockets or whole protein docking. GNINA, utilizing a CNN scoring function to rescore the output poses, outperforms AutoDock Vina scoring on redocking and cross-docking tasks when the binding pocket is defined (Top1 increases from 58% to 73% and from 27% to 37%, respectively) and when the whole protein defines the binding pocket (Top1 increases from 31% to 38% and from 12% to 16%, respectively). The derived ensemble of CNNs generalizes to unseen proteins and ligands and produces scores that correlate well with the RMSD to the known binding pose. We provide the 1.0 version of GNINA under an open source license for use as a molecular docking tool at https://github.com/gnina/gnina.
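The Top1 metric described above can be illustrated with a minimal Python sketch. It is not taken from the GNINA code base: the function name `top1_success_rate`, the input layout (a mapping from target name to a list of `(score, rmsd)` pairs), and the toy values are all hypothetical. It also assumes that a higher score means a better pose, which would need to be inverted for energy-like scores where lower is better.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: target name -> list of (score, rmsd_in_angstrom).
Poses = Dict[str, List[Tuple[float, float]]]


def top1_success_rate(poses_by_target: Poses, rmsd_cutoff: float = 2.0) -> float:
    """Return the percentage of targets whose top-scoring pose has RMSD < cutoff.

    Assumption: a higher score means a better pose. Targets with no
    sampled poses are counted as failures.
    """
    if not poses_by_target:
        raise ValueError("at least one target is required")

    hits = 0
    for poses in poses_by_target.values():
        if not poses:
            continue  # no pose was produced, so this target cannot succeed
        # Select the pose the scoring function ranks first.
        _, top_rmsd = max(poses, key=lambda pose: pose[0])
        if top_rmsd < rmsd_cutoff:
            hits += 1
    return 100.0 * hits / len(poses_by_target)


# Toy data, invented purely for illustration.
toy = {
    "target_a": [(0.91, 1.2), (0.40, 5.3)],  # top pose at 1.2 A: success
    "target_b": [(0.75, 3.8), (0.60, 1.1)],  # top pose at 3.8 A: failure
    "target_c": [(0.88, 0.7)],               # top pose at 0.7 A: success
    "target_d": [(0.30, 6.0), (0.55, 1.9)],  # top pose at 1.9 A: success
}
print(top1_success_rate(toy))  # 75.0
```

Note that `target_b` counts as a failure even though a pose within 2 Å was sampled, because the scoring function did not rank that pose first. This is why Top1 measures the quality of the scoring function and not only the sampling.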