Wang Jian, Dokholyan Nikolay V
bioRxiv. 2024 Oct 14:2024.10.08.617219. doi: 10.1101/2024.10.08.617219.
A complex web of intermolecular interactions defines and regulates biological processes. Understanding this web has been particularly challenging because of the sheer number of actors in biological systems: ∼10 proteins in a typical human cell offer a plausible 10 interactions. This number grows rapidly if we consider metabolites, drugs, nutrients, and other biological molecules. The relative strength of these interactions also critically affects biological processes. However, the small and often incomplete datasets (10 -10 protein-ligand interactions) traditionally used for binding affinity prediction limit the ability to capture the full complexity of these interactions. To overcome this challenge, we developed Yuel 2, a novel neural network-based approach that leverages transfer learning to address the limitations of small datasets. Yuel 2 is pre-trained on a large-scale dataset to learn intricate structural features and then fine-tuned on specialized datasets such as PDBbind to improve predictive accuracy and robustness. We show that Yuel 2 predicts multiple binding affinity metrics (Kd, Ki, and IC50) between proteins and small molecules, offering a comprehensive representation of molecular interactions crucial for drug design and development.
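The pre-train-then-fine-tune scheme described in the abstract can be illustrated with a toy sketch. This is not the Yuel 2 architecture or training code: the linear model, the synthetic data, the names (`train`, `make_data`, `W_TRUE`, `OFFSETS`), and the idea that each metric differs from the others only by a constant offset are all assumptions chosen to keep the example small and self-contained. It only shows the general pattern of learning shared weights on a large dataset and then fitting a small per-metric head on a few samples while the shared weights stay frozen.

```python
import random

def predict(w, b, x):
    """Linear model: shared weights w plus a scalar head b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def train(w, b, data, lr, epochs, freeze_w=False):
    """Plain SGD on squared error; returns the updated (w, b)."""
    w = list(w)
    for _ in range(epochs):
        for x, y in data:
            err = predict(w, b, x) - y
            if not freeze_w:
                w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def mse(w, b, data):
    """Mean squared error of the model on a dataset."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def make_data(rng, n, w_true, offset):
    """Noise-free synthetic samples (hypothetical stand-in for real data)."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in w_true]
        data.append((x, predict(w_true, offset, x)))
    return data

rng = random.Random(0)
W_TRUE = [1.5, -2.0, 0.5]  # assumed "true" feature weights, illustration only

# Stage 1: pre-train the shared weights on a large synthetic dataset.
pretrain_set = make_data(rng, 2000, W_TRUE, 0.0)
w_pre, b_pre = train([0.0, 0.0, 0.0], 0.0, pretrain_set, lr=0.05, epochs=5)

# Stage 2: fine-tune one small head per metric on only 20 samples each,
# keeping the pre-trained weights frozen. The offsets are made up.
OFFSETS = {"Kd": 0.3, "Ki": -0.2, "IC50": 0.8}
heads = {}
for metric, offset in OFFSETS.items():
    small_set = make_data(rng, 20, W_TRUE, offset)
    _, heads[metric] = train(w_pre, b_pre, small_set, lr=0.1, epochs=50,
                             freeze_w=True)

# Evaluate each fine-tuned head on fresh samples for its metric.
test_err = {
    metric: mse(w_pre, heads[metric], make_data(rng, 200, W_TRUE, offset))
    for metric, offset in OFFSETS.items()
}
print(test_err)
```

Because the synthetic data are noise-free and the fine-tuning tasks share the pre-trained weights exactly, the heads recover the offsets almost perfectly; real affinity data would not behave this cleanly.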