Jordan I King, Kondrashov Fyodor A, Adzhubei Ivan A, Wolf Yuri I, Koonin Eugene V, Kondrashov Alexey S, Sunyaev Shamil
National Center for Biotechnology Information, NIH, Bethesda, Maryland 20894, USA.
Nature. 2005 Feb 10;433(7026):633-8. doi: 10.1038/nature03306. Epub 2005 Jan 19.
Amino acid composition of proteins varies substantially between taxa and, thus, can evolve. For example, proteins from organisms with (G + C)-rich (or (A + T)-rich) genomes contain more (or fewer) amino acids encoded by (G + C)-rich codons. However, no universal trends in ongoing changes of amino acid frequencies have been reported. We compared sets of orthologous proteins encoded by triplets of closely related genomes from 15 taxa representing all three domains of life (Bacteria, Archaea and Eukaryota), and used phylogenies to polarize amino acid substitutions. Cys, Met, His, Ser and Phe accrue in at least 14 taxa, whereas Pro, Ala, Glu and Gly are consistently lost. The same nine amino acids are currently accrued or lost in human proteins, as shown by analysis of non-synonymous single-nucleotide polymorphisms. All amino acids with declining frequencies are thought to be among the first incorporated into the genetic code; conversely, all amino acids with increasing frequencies, except Ser, were probably recruited late. Thus, expansion of initially under-represented amino acids, which began over 3,400 million years ago, apparently continues to this day.
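The polarization step described above can be illustrated with a minimal sketch. It assumes a simple parsimony rule that the abstract does not spell out: for each aligned site in a triplet (two sister genomes plus an outgroup), a substitution is polarized only when the sisters differ and the outgroup matches exactly one of them, in which case the outgroup's residue is taken as ancestral. The function names, the gap symbol, and the normalized net-change measure are hypothetical choices for illustration, not the authors' published pipeline.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol in the aligned sequences


def polarize_site(sister_a, sister_b, outgroup):
    """Return (ancestral, derived) for one aligned site, or None.

    Assumed parsimony rule: the outgroup residue is ancestral when it
    matches exactly one of the two differing sister residues.
    """
    if GAP in (sister_a, sister_b, outgroup):
        return None  # skip sites with a gap in any sequence
    if sister_a == sister_b:
        return None  # no substitution between the sisters
    if outgroup == sister_a:
        return sister_a, sister_b  # change occurred on lineage b
    if outgroup == sister_b:
        return sister_b, sister_a  # change occurred on lineage a
    return None  # all three residues differ: direction is ambiguous


def gain_loss_counts(seq_a, seq_b, seq_out):
    """Count per-residue gains and losses over one aligned triplet."""
    if not (len(seq_a) == len(seq_b) == len(seq_out)):
        raise ValueError("aligned sequences must have equal length")
    gains, losses = Counter(), Counter()
    for a, b, out in zip(seq_a, seq_b, seq_out):
        polarized = polarize_site(a, b, out)
        if polarized is None:
            continue
        ancestral, derived = polarized
        losses[ancestral] += 1
        gains[derived] += 1
    return gains, losses


def normalized_net_change(gains, losses):
    """Hypothetical summary: (gains - losses) / (gains + losses) per residue."""
    residues = set(gains) | set(losses)
    return {
        aa: (gains[aa] - losses[aa]) / (gains[aa] + losses[aa])
        for aa in residues
    }
```

Applied to a toy triplet such as `gain_loss_counts("MPAGC", "MSAGC", "MPAG-")`, the only polarizable site is the second one, where Pro is inferred as ancestral and Ser as derived. A real analysis would pool counts over many orthologous alignments per taxon before comparing gains with losses.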