Brutlag Douglas, Apaydin Serkan, Guestrin Carlos, Hsu David, Varma Chris, Singh Amit, Latombe Jean-Claude
Department of Biochemistry, Stanford University, California, USA.
Bioinformatics. 2002;18 Suppl 2:S74. doi: 10.1093/bioinformatics/18.suppl_2.s74.
The problems of protein folding and ligand docking have been explored largely using molecular dynamics or Monte Carlo methods. These methods are computationally intensive because they often explore a much wider range of energies, conformations and time than necessary. In addition, Monte Carlo methods often get trapped in local minima. We initially showed that robotic motion planning permitted one to determine the energy of binding and dissociation of ligands from protein binding sites (Singh et al., 1999). The robotic motion planning method maps complicated three-dimensional conformational states into a much simpler, but higher-dimensional, space in which conformational rearrangements can be represented as linear paths. The dimensionality of the conformation space is of the same order as the number of degrees of conformational freedom in three-dimensional space. We were able to determine the relative energy of association and dissociation of a ligand to a protein by calculating the energetics of interaction for a few thousand conformational states in the vicinity of the protein and choosing the best path from the roadmap. More recently, we have applied roadmap planning to the problem of protein folding (Apaydin et al., 2002a). We represented multiple conformations of a protein as nodes in a compact graph, with the edges representing the probability of moving between neighboring states. Instead of using Monte Carlo simulation to simulate thousands of possible paths through various conformational states, we were able to use Markov methods to calculate the steady-state occupancy of each conformation, needing to calculate the energy of each conformation only once. We referred to this Markov method of representing multiple conformations and transitions as stochastic roadmap simulation, or SRS.
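The steady-state calculation described above can be illustrated with a minimal sketch. It is not the implementation from the cited papers: the five-node roadmap, its energies, the Metropolis-style acceptance rule with a uniform proposal probability, and the function names `transition_matrix` and `steady_state` are all assumptions made for illustration. Each node's energy is specified once, edge probabilities are derived from energy differences, and the occupancy of every node is obtained by iterating the Markov chain rather than by sampling paths.

```python
import math

def transition_matrix(energies, edges, kT=1.0):
    """Build a row-stochastic matrix over roadmap nodes (hypothetical rule).

    Assumption: moves to each neighbor are proposed with probability
    1/max_degree and accepted with the Metropolis probability
    min(1, exp(-dE/kT)); the remaining probability is a self-transition.
    """
    n = len(energies)
    neighbors = [set() for _ in range(n)]
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    max_degree = max(len(nb) for nb in neighbors)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            accept = min(1.0, math.exp(-(energies[j] - energies[i]) / kT))
            P[i][j] = accept / max_degree
        P[i][i] = 1.0 - sum(P[i])  # diagonal is still 0.0 at this point
    return P

def steady_state(P, tol=1e-12, max_iter=100_000):
    """Power iteration: repeatedly apply pi <- pi P until it stops changing."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Toy roadmap: made-up energies (in units of kT) and neighbor edges.
energies = [2.0, 1.0, 0.5, 1.5, 0.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 3)]

P = transition_matrix(energies, edges)
pi = steady_state(P)

# Boltzmann weights for comparison, using the same kT = 1.0.
weights = [math.exp(-e) for e in energies]
boltzmann = [w / sum(weights) for w in weights]

for i, (p, b) in enumerate(zip(pi, boltzmann)):
    print(f"node {i}: steady state {p:.6f}  Boltzmann {b:.6f}")
```

Under the acceptance rule assumed here the chain satisfies detailed balance, so the two printed columns should agree to within the iteration tolerance. A different transition rule would generally give a different steady state.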
We demonstrated that the distribution of conformational states calculated with exhaustive Monte Carlo simulations asymptotically approached the Markov steady state if the same Boltzmann energy distribution was used in both methods. SRS permits one to calculate contributions from all possible paths simultaneously with far fewer energy calculations than Monte Carlo or molecular dynamics methods. The SRS method also permits one to represent multiple unfolded starting states and multiple near-native folded states, and all possible paths between them, simultaneously. The SRS method is also independent of the function used to calculate the energy of the various conformational states. In a paper to be presented at this conference (Apaydin et al., 2002b), we have also applied SRS to ligand docking, in which we calculate the dynamics of ligand-protein association and dissociation in the region of various binding sites on a number of proteins. SRS permits us to determine the relative times of association to and dissociation from various catalytic and non-catalytic binding sites on protein surfaces. Instead of just following the best path in a roadmap, we can calculate the contribution of all the possible binding or dissociation paths and their relative probabilities and energies simultaneously.
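One way to see how a Markov chain accounts for all paths at once is first-step analysis: the expected number of steps to reach a target state satisfies a linear system in the transition probabilities. The sketch below is an illustration under stated assumptions, not the procedure of the cited papers. The three-state matrix `P` is made up, state 2 stands in for a bound or folded target, and the function names are hypothetical. The linear-system answer is compared with a plain Monte Carlo estimate obtained by sampling individual paths.

```python
import random

def expected_steps_to_target(P, targets, tol=1e-12, max_iter=100_000):
    """Solve t_i = 1 + sum_j P[i][j] * t_j for non-target i (t_i = 0 on targets).

    Uses simple fixed-point iteration, which converges when a target is
    reachable from every state.
    """
    n = len(P)
    t = [0.0] * n
    for _ in range(max_iter):
        new_t = [
            0.0 if i in targets
            else 1.0 + sum(P[i][j] * t[j] for j in range(n))
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_t, t)) < tol:
            return new_t
        t = new_t
    return t

def simulate_steps(P, start, targets, rng):
    """Sample one path from `start` and count steps until a target is hit."""
    state, steps = start, 0
    while state not in targets:
        state = rng.choices(range(len(P)), weights=P[state])[0]
        steps += 1
    return steps

# Made-up transition matrix; state 2 plays the role of the target state.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.25, 0.50],
    [0.00, 0.00, 1.00],
]
targets = {2}

# For this toy matrix the system can be solved by hand: t_0 = 5, t_1 = 3.
exact = expected_steps_to_target(P, targets)

rng = random.Random(0)  # fixed seed so the sampling is repeatable
n_walks = 20_000
mc_estimate = sum(simulate_steps(P, 0, targets, rng) for _ in range(n_walks)) / n_walks

print("first-step analysis:", exact)
print("Monte Carlo estimate from state 0:", mc_estimate)
```

The linear-system result needs each transition probability once, while the Monte Carlo estimate needs many sampled paths and still carries sampling error.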