Sobolevsky Yehoshua, Trifonov Edward N
Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel.
J Mol Evol. 2006 Nov;63(5):622-34. doi: 10.1007/s00239-005-0190-4. Epub 2006 Oct 29.
A universal scale of sequence conservation has recently been introduced, based on the omnipresence of protein sequence motifs across species. A large spectrum of short sequences, up to eight residues long, has been found to reside in all or almost all prokaryotic organisms. This discovery introduces a fundamentally new quantitative approach to the problem of reconstructing the last universal common ancestor (LUCA). By combining sequence data with protein crystal structure data, this work outlines the most conserved elements (protein modules) that have defined structures and sequences and harbor the omnipresent motifs. The structurally conserved modules involve 25-30 amino acid residues and have the appearance of closed loops (loop-n-lock structures). This confirms earlier conclusions on the loop-fold structure of globular proteins. Many of the most conserved modules represent the primary closed-loop prototypes that were derived earlier by whole-genome sequence searches. The data presented thus provide a basis for further work toward the earliest stages of protein evolution.
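The idea of scoring short motifs by their omnipresence can be illustrated with a minimal Python sketch. It is not the authors' procedure, which the abstract does not specify. It assumes that omnipresence is measured as the fraction of proteomes containing an exact k-residue match. The function names, the organism labels, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-residue substrings of one protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def omnipresence(proteomes, k):
    """Map each k-mer to the fraction of proteomes in which it occurs at least once.

    Assumption: exact string matching, one vote per organism regardless of
    how many proteins in that organism contain the motif.
    """
    counts = Counter()
    for proteins in proteomes.values():
        present = set()
        for seq in proteins:
            present |= kmers(seq, k)
        counts.update(present)  # each organism counts a motif at most once
    n = len(proteomes)
    return {motif: c / n for motif, c in counts.items()}


# Hypothetical toy data: three "organisms" with made-up protein sequences.
proteomes = {
    "org_a": ["MGSGKSTLLA", "AAAA"],
    "org_b": ["TTGSGKSTQQ"],
    "org_c": ["GSGKSTW"],
}

scores = omnipresence(proteomes, 6)
# Motifs found in every proteome are the "omnipresent" ones under this definition.
omnipresent = sorted(m for m, s in scores.items() if s == 1.0)
print(omnipresent)
```

In this toy set only one hexamer occurs in all three proteomes. A real analysis would run over complete prokaryotic proteomes and would need a tolerance for "almost all" organisms, for example a threshold slightly below 1.0.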