Pierdominici-Sottile Gustavo, Palma Juliana
Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, Sáenz Peña 352, Bernal, B1876BXD, Argentina.
J Comput Chem. 2015 Mar 15;36(7):424-32. doi: 10.1002/jcc.23811. Epub 2014 Dec 16.
Different conformations of a given protein can be compared, in terms of both structure and dynamics, by combined principal component analysis (combined-PCA). To that end, a single trajectory is built by concatenating the molecular dynamics trajectories of the individual conformations under comparison. The principal components are then calculated by diagonalizing the correlation matrix of the concatenated trajectory. Since its introduction in 1995, this approach has had a large number of applications. However, the interpretation of the resulting eigenvectors and eigenvalues rests on intuitive foundations, because analytical expressions relating the concatenated correlation matrix to those of the individual trajectories have not yet been provided. In this article, we present such expressions for the cases of two, three, and an arbitrary number of concatenated trajectories. The formulas are simple and show what is, and what is not, to be expected from a combined-PCA. Their correctness and usefulness are demonstrated by discussing some representative examples. The results can be summarized in a single sentence: the correlation matrix of a concatenated trajectory is given by the average of the individual correlation matrices plus the correlation matrix of the individual averages. From this it follows that the combined-PCA of trajectories belonging to different free energy basins provides information that could also be obtained by alternative and more straightforward means.
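The summary sentence of the abstract can be checked numerically. The sketch below is only an illustration, not the authors' code. It assumes that "correlation matrix" means the covariance matrix of the coordinates normalized by the number of frames, and that all trajectories have the same length. The function names and the synthetic Gaussian "trajectories" are hypothetical.

```python
import numpy as np

def covariance(x):
    """Covariance of x (frames x coordinates), normalized by the number of frames."""
    centered = x - x.mean(axis=0)
    return centered.T @ centered / x.shape[0]

def combined_covariance_decomposition(trajectories):
    """Return (direct, within, between) for a list of equal-length trajectories.

    direct  : covariance of the concatenated trajectory
    within  : average of the individual covariance matrices
    between : covariance of the individual average structures
    """
    direct = covariance(np.concatenate(trajectories, axis=0))
    within = np.mean([covariance(t) for t in trajectories], axis=0)
    means = np.stack([t.mean(axis=0) for t in trajectories])
    between = covariance(means)
    return direct, within, between

# Synthetic data standing in for three MD trajectories (assumption:
# 500 frames each, 6 coordinates, different average structures).
rng = np.random.default_rng(0)
n_frames, n_coords = 500, 6
trajs = [rng.normal(loc=shift, scale=1.0, size=(n_frames, n_coords))
         for shift in (0.0, 2.0, -1.0)]

direct, within, between = combined_covariance_decomposition(trajs)
# The concatenated matrix equals the average of the individual matrices
# plus the matrix of the individual averages.
assert np.allclose(direct, within + between)
```

Under these assumptions the identity holds exactly, not just approximately. When the average structures differ widely, as in the synthetic data here, the `between` term dominates, which matches the abstract's remark about trajectories from different free energy basins.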