Teodoro Miguel L, Phillips George N, Kavraki Lydia E
Department of Biochemistry and Cell Biology and Department of Computer Science, Rice University, 6100 Main Street, MS 140, Houston, TX 77005, USA.
J Comput Biol. 2003;10(3-4):617-34. doi: 10.1089/10665270360688228.
This work shows how to decrease the complexity of modeling flexibility in proteins by reducing the number of dimensions necessary to model important macromolecular motions such as the induced-fit process. Induced fit occurs during the binding of a protein to other proteins, nucleic acids, or small molecules (ligands) and is a critical part of protein function. It is now widely accepted that conformational changes of proteins can affect their ability to bind other molecules and that any progress in modeling protein motion and flexibility will contribute to the understanding of key biological functions. However, modeling protein flexibility has proven a very difficult task. Experimental laboratory methods, such as X-ray crystallography, produce rather limited information, while computational methods such as molecular dynamics are too slow for routine use with large systems. In this work, we show how to use principal component analysis, a dimensionality reduction technique, to transform the original high-dimensional representation of protein motion into a lower-dimensional representation that captures the dominant modes of protein motion. For a medium-sized protein, this corresponds to reducing a problem with a few thousand degrees of freedom to one with fewer than fifty. Although there is inevitably some loss in accuracy, we show that we can obtain conformations that have been observed in laboratory experiments, starting from different initial conformations and working in a drastically reduced search space.
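The dimensionality reduction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a set of conformations that have already been superimposed on a common reference and flattened to vectors of length 3N (N atoms), and it uses synthetic data in place of a molecular dynamics trajectory. The function names (`pca_conformations`, `project`, `reconstruct`) and all sizes are hypothetical, chosen only for illustration.

```python
import numpy as np


def pca_conformations(coords, n_components):
    """PCA on conformations given as an (n_frames, 3 * n_atoms) array.

    Assumes the frames are already aligned to a common reference.
    Returns the mean conformation, the leading principal components
    (one per row), and the fraction of variance they explain.
    """
    mean = coords.mean(axis=0)
    centered = coords - mean
    # Rows of vt are orthonormal directions ordered by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values**2 / (coords.shape[0] - 1)
    explained = variance[:n_components].sum() / variance.sum()
    return mean, vt[:n_components], explained


def project(coords, mean, components):
    """Map full-dimensional conformations to reduced coordinates."""
    return (coords - mean) @ components.T


def reconstruct(scores, mean, components):
    """Map reduced coordinates back to full Cartesian coordinates."""
    return scores @ components + mean


# Synthetic stand-in for a trajectory: 3 collective modes plus small noise.
rng = np.random.default_rng(0)
n_frames, n_atoms, n_modes = 200, 100, 3
modes = rng.normal(size=(n_modes, 3 * n_atoms))
amplitudes = rng.normal(size=(n_frames, n_modes)) * np.array([5.0, 3.0, 2.0])
reference = rng.normal(size=3 * n_atoms) * 10.0
noise = rng.normal(scale=0.05, size=(n_frames, 3 * n_atoms))
trajectory = reference + amplitudes @ modes + noise

mean, components, explained = pca_conformations(trajectory, n_components=3)
scores = project(trajectory, mean, components)
rebuilt = reconstruct(scores, mean, components)

# Per-frame RMSD between original and reconstructed conformations.
sq_dev = ((rebuilt - trajectory) ** 2).reshape(n_frames, n_atoms, 3).sum(axis=2)
rmsd = np.sqrt(sq_dev.mean(axis=1))

print(f"reduced {trajectory.shape[1]} coordinates to {scores.shape[1]}")
print(f"variance explained: {explained:.4f}, max RMSD: {rmsd.max():.3f}")
```

In this toy setting, 300 Cartesian coordinates collapse to three reduced coordinates because the data were generated from three collective modes. For a real protein the number of components to keep would have to be chosen from the variance spectrum, and the reconstruction error corresponds to the loss in accuracy the abstract mentions.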