Nguyen Phuong H
Institute of Physical and Theoretical Chemistry, J.W. Goethe University, Frankfurt, Germany.
Proteins. 2006 Dec 1;65(4):898-913. doi: 10.1002/prot.21185.
Employing the recently developed hierarchical nonlinear principal component analysis (NLPCA) method of Saegusa et al. (Neurocomputing 2004;61:57-70 and IEICE Trans Inf Syst 2005;E88-D:2242-2248), we studied the complexity of the free energy landscapes of several peptides, including triglycine, hexaalanine, and the C-terminal beta-hairpin of protein G. First, the performance of this NLPCA method was compared with that of standard linear principal component analysis (PCA). In particular, we compared the two methods according to (1) their ability to reduce dimensionality and (2) how efficiently they represent peptide conformations in the low-dimensional spaces spanned by the first few principal components. The study revealed that NLPCA reduces the dimensionality of the considered systems much better than PCA does. For example, to reach a similar error in representing the original beta-hairpin data in a low-dimensional space, one needs 4 principal components with NLPCA but 21 with PCA. Second, representing the free energy landscapes of the considered systems as a function of the first two principal components obtained from PCA yielded relatively well-structured landscapes. In contrast, the free energy landscapes obtained from NLPCA are much more complicated, exhibiting many states that are hidden in the PCA maps, especially in the unfolded regions. Furthermore, the study showed that many states in the PCA maps are mixtures of several peptide conformations, whereas those in the NLPCA maps are purer. This finding suggests that NLPCA should be used to capture the essential features of these systems.
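The two comparison criteria above can be illustrated for the linear PCA baseline only. The following is a minimal sketch, not the authors' code and not the hierarchical NLPCA of Saegusa et al.: it assumes the trajectory is available as an array of frames by coordinates, uses synthetic two-state data as a stand-in, and takes the free energy as F = -kT ln P(PC1, PC2) in units of kT. All function names, the bin count, and the data are hypothetical.

```python
import numpy as np

def pca_fit(X):
    """Return the data mean, the principal axes (as rows), and the projections."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are principal axes,
    # ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return mean, Vt, Xc @ Vt.T

def reconstruction_error(X, mean, axes, k):
    """Mean squared error when only the first k components are kept."""
    Xc = X - mean
    Xk = (Xc @ axes[:k].T) @ axes[:k]  # project down, then map back
    return float(np.mean((Xc - Xk) ** 2))

def free_energy_2d(pc1, pc2, bins=30, kT=1.0):
    """F = -kT ln P(pc1, pc2), shifted so the global minimum is zero.

    Bins that were never visited get F = inf.
    """
    hist, xedges, yedges = np.histogram2d(pc1, pc2, bins=bins)
    prob = hist / hist.sum()
    with np.errstate(divide="ignore"):
        F = -kT * np.log(prob)
    F -= F[np.isfinite(F)].min()
    return F, xedges, yedges

# Synthetic stand-in for a trajectory (an assumption, not peptide data):
# 2000 frames, 10 coordinates, two well-separated "states".
rng = np.random.default_rng(0)
state_a = rng.normal(loc=0.0, scale=1.0, size=(1000, 10))
state_b = rng.normal(loc=3.0, scale=1.0, size=(1000, 10))
X = np.vstack([state_a, state_b])

mean, axes, proj = pca_fit(X)

# Criterion (1): error as a function of the number of retained components.
errors = [reconstruction_error(X, mean, axes, k) for k in range(1, 11)]

# Criterion (2): free energy landscape along the first two components.
F, xedges, yedges = free_energy_2d(proj[:, 0], proj[:, 1])
```

Reading off the smallest k at which `errors` falls below a chosen threshold gives the kind of component count quoted in the abstract; repeating the same measurement with a nonlinear method would require an NLPCA implementation, which is not sketched here.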