Zheng Zheng, Ucisik Melek N, Merz Kenneth M
Department of Chemistry and the Quantum Theory Project, 2328 New Physics Building, P.O. Box 118435, University of Florida, Gainesville, Florida 32611-8435.
J Chem Theory Comput. 2013 Dec 10;9(12):5526-5538. doi: 10.1021/ct4005992.
Accurately computing the free energy for biological processes like protein folding or protein-ligand association remains a challenging problem. Both describing the complex intermolecular forces involved and sampling the requisite configuration space make understanding these processes innately difficult. Herein, we address the sampling problem using a novel methodology we term "movable type". Conceptually, it can be understood by analogy with the evolution of printing, hence the name movable type. For example, a common approach to the study of protein-ligand complexation involves taking a database of intact drug-like molecules and exhaustively docking them into a binding pocket. This is reminiscent of early woodblock printing, where each page had to be laboriously created prior to printing a book. However, printing evolved to an approach where a database of symbols (letters, numerals, etc.) was created and then assembled using a movable type system, which allowed for the creation of all possible combinations of symbols on a given page, thereby revolutionizing the dissemination of knowledge. Our movable type (MT) method involves identifying all atom pairs seen in protein-ligand complexes and then creating two databases: one with their associated pairwise distance-dependent energies and another with the probabilities of how these pairs can combine in terms of bonds, angles, dihedrals and non-bonded interactions. Combining these two databases with the principles of statistical mechanics allows us to accurately estimate binding free energies as well as the pose of a ligand in a receptor. This method, by its mathematical construction, samples all of the configuration space of a selected region (the protein active site here) in one shot without resorting to brute force sampling schemes involving Monte Carlo, genetic algorithms or molecular dynamics simulations, making the methodology extremely efficient.
Importantly, this method explores the free energy surface, eliminating the need to estimate the enthalpy and entropy components individually. Finally, low free energy structures can be obtained via a free energy minimization procedure, yielding all low free energy poses on a given free energy surface. Besides revolutionizing the protein-ligand docking and scoring problem, this approach can be utilized in a wide range of applications in computational biology that involve the computation of free energies for systems with extensive phase spaces, including protein folding, protein-protein docking and protein design.
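The general statistical-mechanics idea behind replacing explicit sampling with a sum over tabulated pair energies can be illustrated with a toy calculation. The sketch below is not the MT implementation and uses none of its databases. It assumes, purely for illustration, that atom pairs contribute independently, that each pair energy is a simple 12-6 Lennard-Jones function, and that distances are weighted uniformly on a grid. All function names and parameter values are hypothetical.

```python
import math

# Thermal energy kT in kcal/mol near 298 K (R = 1.987e-3 kcal/mol/K).
KT = 0.593


def lennard_jones(r, epsilon, sigma):
    """Toy 12-6 pair energy; a stand-in for a tabulated pairwise energy."""
    x = (sigma / r) ** 6
    return 4.0 * epsilon * (x * x - x)


def pair_partition_function(energy_fn, r_min, r_max, n_bins):
    """Average Boltzmann factor of one pair over a uniform distance grid.

    Assumption: every distance bin is equally probable. A real method
    would weight bins by a distance-dependent probability instead.
    """
    dr = (r_max - r_min) / n_bins
    total = 0.0
    for i in range(n_bins):
        r = r_min + (i + 0.5) * dr  # midpoint of bin i
        total += math.exp(-energy_fn(r) / KT)
    return total / n_bins


def free_energy(pair_partition_functions):
    """Free energy -kT ln Z, with Z taken as a product over pairs.

    Assumption: pairs are independent, so ln Z is a sum of ln z_i.
    """
    return -KT * sum(math.log(z) for z in pair_partition_functions)


if __name__ == "__main__":
    # Two hypothetical atom pairs with made-up parameters (kcal/mol, angstrom).
    pairs = [(0.15, 3.4), (0.20, 3.1)]
    zs = [
        pair_partition_function(
            lambda r, e=eps, s=sig: lennard_jones(r, e, s), 3.0, 8.0, 500
        )
        for eps, sig in pairs
    ]
    print("toy free energy (kcal/mol):", free_energy(zs))
```

Because each pair's Boltzmann factors are summed over the whole distance grid, the result is a free energy rather than a single-point energy, so enthalpic and entropic contributions are never separated. The independence assumption is what makes the sum cheap here. How the published method combines pair probabilities is described in the paper itself and is not reproduced by this sketch.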