Theobald Douglas L, Wuttke Deborah S
Department of Biochemistry, Brandeis University, Waltham, Massachusetts, USA.
PLoS Comput Biol. 2008 Feb;4(2):e43. doi: 10.1371/journal.pcbi.0040043.
The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method ("PCA plots") for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology.
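As a rough illustration of the PCA step described above, the following Python sketch computes the principal components of an atom-by-atom correlation matrix from an ensemble of already superposed structures. It is not the authors' implementation: the ordinary sample covariance stands in for the maximum likelihood estimate, and the function name `correlation_pca`, the `(n_models, n_atoms, 3)` array layout, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def correlation_pca(coords):
    """PCA of the atom-by-atom correlation matrix of a superposed ensemble.

    coords: array of shape (n_models, n_atoms, 3), assumed already superposed.
    Assumption: the plain sample covariance is used here as a stand-in for
    the maximum likelihood estimate described in the abstract.
    Returns (eigenvalues in descending order, eigenvectors as columns).
    """
    coords = np.asarray(coords, dtype=float)
    n_models = coords.shape[0]

    # Displacement of every atom from its mean position across the ensemble.
    dev = coords - coords.mean(axis=0)

    # n_atoms x n_atoms covariance: averaged dot products of displacements.
    cov = np.einsum("mia,mja->ij", dev, dev) / (3 * n_models)

    # Normalising by the per-atom standard deviations gives the correlation
    # matrix, which removes each atom's own variance from the comparison.
    # Assumes every atom has non-zero variance.
    sd = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sd, sd)

    # eigh is appropriate for a symmetric matrix; sort largest mode first.
    evals, evecs = np.linalg.eigh(corr)
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order]


# Hypothetical synthetic ensemble: atoms 0-2 move together, atoms 3-5 move
# in the opposite direction, plus a small amount of independent noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=(100, 1, 3))
mean_structure = 10.0 * rng.normal(size=(1, 6, 3))
direction = np.array([1, 1, 1, -1, -1, -1]).reshape(1, 6, 1)
ensemble = mean_structure + direction * shared + 0.1 * rng.normal(size=(100, 6, 3))

evals, evecs = correlation_pca(ensemble)
print("fraction of correlation in first mode:", evals[0] / evals.sum())
print("first principal component:", np.round(evecs[:, 0], 2))
```

In a "PCA plot" of the kind the abstract describes, the entries of one eigenvector would be mapped to colors on the corresponding atoms, so that atoms with the same sign share a color family. In this synthetic example the first component separates atoms 0-2 from atoms 3-5.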