Assessment of similarities of pairs and groups of proteins using transformed amino-acid-residue data.

Author information

Reisner A H, Westwood N H

Publication information

J Mol Evol. 1982;18(4):240-50. doi: 10.1007/BF01734102.

Abstract

Using as a primary standard a representative set of 208 proteins whose amino-acid-residue mole frequencies have been accurately established, a standard distribution of mole frequencies is defined for each amino acid, in terms of which percentile values for the observed mole frequencies of the amino-acid residues in any other protein can be determined. Data so transformed have a distribution much closer to Gaussian than untransformed values, and allow meaningful determinations of correlations between the amino-acid-residue compositions of two proteins as well as between pairs of amino-acid residues within groups of proteins. Of the 153 possible pairs of amino acids (Asx and Glx are used, giving 18 residue classes), 39 are significantly correlated at p ≤ 0.01 and 22 at p ≤ 0.001. A percentile table is included for those wishing to use the method with programmable calculators. The transformed data for amino-acid compositions have been used to perform principal components analyses on groups of proteins to determine whether meaningful sub-groupings (observable clusters in scatter diagrams) were detectable. Such analyses are shown for the representative set of proteins and for a group of 184 globins. With regard to the globin chains, alpha chains show a correlation with the evolutionary time-scale in the first principal component projection (PCP), which accounts for 22% of the variance, while beta chains show such a correlation in the first and second PCPs (22% and 18% of the variance respectively). Thus, alpha and beta chains appear to diverge from a common progenitor, similar in position to globin chains from "primitive" forms.
Furthermore, globins from "primitive" forms are nearer to one another than they are to globins from the vertebrates, a finding with no a priori explanation. It suggests, perhaps, that once a chain has reached a stable relationship with its environment, strong constraints are placed on the co-existing globin chains so that they maintain appropriate interaction with one another. In addition, the positions of the epsilon, gamma and delta chains are in the order: epsilon (embryonal) more primitive than gamma (foetal), gamma more primitive than delta, and delta equal to beta (adult).
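
The percentile transformation and the protein-to-protein correlation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure or their percentile table: the `STANDARD` values, the two example compositions, and the function names `percentile_rank`, `transform` and `pearson` are all hypothetical, and the mid-rank percentile convention is an assumption. A real application would use the 208-protein standard and all 18 residue classes.

```python
from bisect import bisect_left, bisect_right
from math import sqrt


def percentile_rank(reference, value):
    """Mid-rank percentile (0-100) of `value` within a reference sample."""
    ref = sorted(reference)
    below = bisect_left(ref, value)           # reference values strictly below
    ties = bisect_right(ref, value) - below   # reference values equal to `value`
    return 100.0 * (below + 0.5 * ties) / len(ref)


def transform(composition, standard):
    """Replace each mole frequency by its percentile in the standard set."""
    return {res: percentile_rank(standard[res], freq)
            for res, freq in composition.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy standard: mole percentages of four residues in six
# made-up reference proteins (NOT values from the paper).
STANDARD = {
    "Ala": [5.0, 6.5, 7.8, 8.2, 9.1, 10.4],
    "Gly": [4.2, 6.0, 7.1, 8.3, 9.0, 12.5],
    "Lys": [2.1, 4.4, 5.9, 6.6, 7.2, 11.0],
    "Trp": [0.0, 0.5, 1.0, 1.4, 2.0, 3.1],
}

# Two hypothetical proteins to compare (also made-up values).
protein_a = {"Ala": 9.5, "Gly": 6.5, "Lys": 7.0, "Trp": 0.7}
protein_b = {"Ala": 9.0, "Gly": 7.0, "Lys": 6.8, "Trp": 0.4}

ta = transform(protein_a, STANDARD)
tb = transform(protein_b, STANDARD)
residues = sorted(STANDARD)
r = pearson([ta[k] for k in residues], [tb[k] for k in residues])
print("percentiles A:", ta)
print("percentiles B:", tb)
print("correlation of transformed compositions:", round(r, 3))
```

Working on percentiles rather than raw mole frequencies places abundant residues (such as Ala) and rare ones (such as Trp) on a common scale, so that the abundant residues do not dominate the correlation.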
