Kinjo Akira R, Nakamura Haruki
Institute for Protein Research, Osaka University, Suita, Osaka, Japan.
PLoS One. 2008 Apr 9;3(4):e1963. doi: 10.1371/journal.pone.0001963.
Position-specific scoring matrices (PSSMs) are useful for detecting weak homology in protein sequence analysis, and they are thought to contain some essential signatures of protein families. To elucidate what kinds of ingredients constitute such family-specific signatures, we apply singular value decomposition to a set of PSSMs and examine the properties of the dominant right and left singular vectors. The first right singular vectors were correlated with various amino acid indices, including relative mutability, amino acid composition in the protein interior, hydropathy, or turn propensity, depending on the protein. A significant correlation between the first left singular vector and a measure of site conservation was observed. It is shown that the contribution of the first singular component to the PSSMs acts to disfavor potentially, but falsely, functionally important residues at conserved sites. The second right singular vectors were highly correlated with hydrophobicity scales, and the corresponding left singular vectors with the contact numbers of protein structures. It is suggested that sequence alignment with a PSSM is essentially equivalent to threading supplemented with functional information. In addition, singular vectors may be useful for analyzing and annotating the characteristics of conserved sites in protein families.
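The decomposition described in the abstract can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' pipeline: the PSSM is a random stand-in of shape (sites x 20), the column order `AMINO_ACIDS` is an arbitrary choice, the per-site score spread used as a conservation proxy is hypothetical (the abstract does not specify the measure), and the Kyte-Doolittle hydropathy scale stands in for the unspecified hydrophobicity scales. Singular vectors are defined only up to sign, so absolute correlations are compared.

```python
import numpy as np

# Assumed column order for the 20 amino acids (one-letter codes).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Kyte-Doolittle hydropathy values, listed in the same order as AMINO_ACIDS.
KYTE_DOOLITTLE = np.array([
    1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
    3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2,
])


def decompose_pssm(pssm):
    """Thin SVD of an (n_sites x 20) PSSM.

    Columns of U are left singular vectors (one value per site);
    rows of Vt are right singular vectors (one value per amino acid).
    """
    pssm = np.asarray(pssm, dtype=float)
    return np.linalg.svd(pssm, full_matrices=False)


def singular_component(U, s, Vt, k):
    """Rank-1 contribution of the k-th singular component to the PSSM."""
    return s[k] * np.outer(U[:, k], Vt[k])


def abs_pearson(x, y):
    """Absolute Pearson correlation (the sign of a singular vector is arbitrary)."""
    return abs(float(np.corrcoef(x, y)[0, 1]))


# Random stand-in for a real PSSM; a real one would come from a profile search.
rng = np.random.default_rng(0)
n_sites = 50
toy_pssm = rng.normal(size=(n_sites, len(AMINO_ACIDS)))

U, s, Vt = decompose_pssm(toy_pssm)

# Hypothetical conservation proxy: spread of scores at each site.
site_spread = toy_pssm.std(axis=1)

# First left singular vector versus the per-site proxy.
r_conservation = abs_pearson(U[:, 0], site_spread)

# Second right singular vector versus a hydropathy scale.
r_hydropathy = abs_pearson(Vt[1], KYTE_DOOLITTLE)

# Summing all rank-1 components recovers the original matrix.
reconstructed = sum(singular_component(U, s, Vt, k) for k in range(len(s)))
```

On random input the two correlations carry no biological meaning; the sketch only shows which vector is compared with which per-site or per-residue quantity.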