Aravind L, Anantharaman Vivek, Koonin Eugene V
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
Proteins. 2002 Jul 1;48(1):1-14. doi: 10.1002/prot.10064.
Protein sequence and structure comparisons show that the catalytic domains of Class I aminoacyl-tRNA synthetases, a related family of nucleotidyltransferases involved primarily in coenzyme biosynthesis, nucleotide-binding domains related to the UspA protein (USPA domains), photolyases, electron transport flavoproteins, and PP-loop-containing ATPases together comprise a distinct class of alpha/beta domains designated the HUP domain after HIGH-signature proteins, UspA, and PP-ATPase. Several lines of evidence are presented to support the monophyly of the HUP domains, to the exclusion of other three-layered alpha/beta folds with the generic "Rossmann-like" topology. Cladistic analysis, with patterns of structural and sequence similarity used as discrete characters, identified three major evolutionary lineages within the HUP domain class: the PP-ATPases; the HIGH superfamily, which includes class I aaRS and related nucleotidyltransferases containing the HIGH signature in their nucleotide-binding loop; and a previously unrecognized USPA-like group, which includes USPA domains, electron transport flavoproteins, and photolyases. Examination of the patterns of phyletic distribution of distinct families within these three major lineages suggests that the last universal common ancestor (LUCA) of all modern life forms encoded 15-18 distinct alpha/beta ATPases and nucleotide-binding proteins of the HUP class. This points to an extensive radiation of HUP domains before LUCA, during which the multiple class I aminoacyl-tRNA synthetases emerged only at a late stage. Thus, substantial evolutionary diversification of protein domains occurred well before the modern version of the protein-dependent translation machinery was established, i.e., still in the RNA world.