Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA.
Nucleic Acids Res. 2018 Sep 19;46(16):8143-8152. doi: 10.1093/nar/gky604.
Conformational ensembles of biopolymers, whether proteins or chromosomes, can be described using contact matrices. Principal component analysis (PCA) of contact data has been used to interrogate the structure and/or dynamics of both proteins and chromosomes. However, because these fields developed separately, different variants of PCA have emerged. A variant we here term Implicit-PCA (I-PCA) has previously been applied to chromosome contact matrices, where it revealed the spatial segregation of active and inactive chromatin. Separately, Explicit-PCA (E-PCA) has previously been applied to proteins to characterize their correlated structural fluctuations. Here, we swapped the two analysis methods, applying each to a biopolymer type (chromosome or protein) other than the one for which it was initially developed. We find that applying E-PCA to chromosome distance matrices derived from microscopy data can reveal the dominant motion (concerted fluctuation) of these chromosomes. Further, by applying E-PCA to Hi-C data across the human blood cell lineage, we isolated the aspects of chromosome structure that most strongly differentiate cell types. Conversely, when we applied I-PCA to simulation snapshots of proteins, the major component reported the consensus features of the structure, making this a promising approach for future analysis of semi-structured proteins.
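The abstract does not spell out how the two variants are computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that E-PCA treats each snapshot's contact or distance matrix as one observation and runs PCA across the ensemble, and that I-PCA eigen-decomposes the row-wise correlation matrix of a single averaged contact matrix, as is commonly done for Hi-C compartment analysis. The function names `explicit_pca` and `implicit_pca` and the toy data are hypothetical.

```python
import numpy as np


def explicit_pca(matrices, n_components=2):
    """E-PCA sketch (assumed definition): PCA across an ensemble of matrices.

    matrices: array of shape (n_snapshots, n, n), symmetric per snapshot.
    Returns (components, variances, scores).
    """
    matrices = np.asarray(matrices, dtype=float)
    n = matrices.shape[1]
    rows, cols = np.triu_indices(n, k=1)
    # Flatten the upper triangle of each snapshot into one observation vector.
    X = matrices[:, rows, cols]
    Xc = X - X.mean(axis=0)  # center each pair feature over the ensemble
    # SVD of the centered data gives the principal axes in Vt.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = S**2 / (X.shape[0] - 1)
    scores = U[:, :n_components] * S[:n_components]
    return Vt[:n_components], variances[:n_components], scores


def implicit_pca(contact_matrix, n_components=1):
    """I-PCA sketch (assumed definition): eigenvectors of the row-wise
    correlation matrix of a single (ensemble-averaged) contact matrix.

    Returns (components, eigenvalues), largest eigenvalue first.
    """
    C = np.asarray(contact_matrix, dtype=float)
    corr = np.corrcoef(C)  # rows with zero variance would yield NaN
    vals, vecs = np.linalg.eigh(corr)
    order = np.argsort(vals)[::-1]
    return vecs[:, order[:n_components]].T, vals[order[:n_components]]


# Toy ensemble: 20 random 6-bead conformations and their distance matrices.
rng = np.random.default_rng(0)
coords = rng.normal(size=(20, 6, 3))
dist = np.linalg.norm(coords[:, :, None, :] - coords[:, None, :, :], axis=-1)
e_components, e_variances, e_scores = explicit_pca(dist, n_components=2)

# Toy averaged contact map with two compartments (beads 0-2 and 3-5).
block = np.full((6, 6), 0.2)
block[:3, :3] = 1.0
block[3:, 3:] = 1.0
i_components, i_eigenvalues = implicit_pca(block, n_components=1)
```

Under these assumptions, the leading E-PCA component describes the dominant concerted fluctuation across snapshots, while the sign pattern of the leading I-PCA component partitions the beads into the two groups with similar contact profiles.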