Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, 332 NSRC, 1101 W Peabody Drive, Urbana, IL 61801, USA.
J Mol Evol. 2011 Jan;72(1):14-33. doi: 10.1007/s00239-010-9400-9. Epub 2010 Nov 17.
The origin of life has puzzled molecular scientists for over half a century. Yet fundamental questions remain unanswered, including which came first, the metabolic machinery or the encoding nucleic acids. In this study, we take a protein-centric view and explore the ancestral origins of proteins. Protein domain structures in proteomes are highly conserved and embody molecular functions and interactions that are needed for cellular and organismal processes. Here we use domain structure to study the evolution of molecular function in the protein world. Timelines describing the age and function of protein domains at fold, fold superfamily, and fold family levels of structural complexity were derived from a structural phylogenomic census in hundreds of fully sequenced genomes. These timelines reveal congruent hourglass patterns in the rates of appearance of domain structures and functions, in functional diversity, and in hierarchical complexity, and show a gradual buildup of protein repertoires associated with metabolism, translation, and DNA, in that order. The most ancient domain architectures were hydrolase enzymes, and the first translation domains had catalytic functions for aminoacylation and for the molecular switch-driven transport of RNA. Remarkably, the most ancient domains had metabolic roles, did not interact with RNA, and preceded the gradual buildup of translation. In fact, the first translation domains also had a metabolic origin and were only later followed by specialized translation machinery. Our results explain how the generation of structure in the protein world and the concurrent crystallization of translation and diversified cellular life created further opportunities for proteomic diversification.