Calculation of protein-ligand binding entropies using a rule-based molecular fingerprint.

Affiliations

Department of Computer Science, California State University, Los Angeles, California.

Kravis Department of Integrated Sciences, Claremont McKenna College, Claremont, California.

Publication information

Biophys J. 2024 Sep 3;123(17):2839-2848. doi: 10.1016/j.bpj.2024.03.017. Epub 2024 Mar 13.

Abstract

The use of fast in silico prediction methods for protein-ligand binding free energies holds significant promise for the initial phases of drug development. Numerous traditional physics-based models (e.g., implicit solvent models), however, tend to either neglect or heavily approximate entropic contributions to binding due to their computational complexity. Consequently, such methods often yield imprecise assessments of binding strength. Machine learning models provide accurate predictions and can often outperform physics-based models. They, however, are often prone to overfitting, and the interpretation of their results can be difficult. Physics-guided machine learning models combine the consistency of physics-based models with the accuracy of modern data-driven algorithms. This work integrates physics-based model conformational entropies into a graph convolutional network. We introduce a new neural network architecture (a rule-based graph convolutional network) that generates molecular fingerprints according to predefined rules specifically optimized for binding free energy calculations. Our results on 100 small host-guest systems demonstrate significant improvements in convergence and preventing overfitting. We additionally demonstrate the transferability of our proposed hybrid model by training it on the aforementioned host-guest systems and then testing it on six unrelated protein-ligand systems. Our new model shows little difference in training set accuracy compared to a previous model but an order-of-magnitude improvement in test set accuracy. Finally, we show how the results of our hybrid model can be interpreted in a straightforward fashion.

Cited by

1
Machine learning tools advance biophysics.
Biophys J. 2024 Sep 3;123(17):E1-E3. doi: 10.1016/j.bpj.2024.07.036. Epub 2024 Aug 21.
