Ray Partho Sarothi, Fox Paul L
Department of Cellular and Molecular Medicine, The Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America; Department of Biological Sciences, Indian Institute of Science Education and Research, Kolkata, India.
Department of Cellular and Molecular Medicine, The Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio, United States of America.
PLoS One. 2014 Jun 26;9(6):e98493. doi: 10.1371/journal.pone.0098493. eCollection 2014.
Repeated domains in proteins that have undergone duplication or loss, and sequence divergence, are especially informative about phylogenetic relationships. We have exploited divergent repeats of the highly structured, 50-amino acid WHEP domains that join the catalytic subunits of bifunctional glutamyl-prolyl tRNA synthetase (EPRS) as a sequence-informed repeat (SIR) to trace the origin and evolution of EPRS in holozoa. EPRS is the only fused tRNA synthetase, with two distinct aminoacylation activities, and a non-canonical translation regulatory function mediated by the WHEP domains in the linker. Investigating the duplications, deletions and divergence of WHEP domains, we traced the bifunctional EPRS to choanozoans and identified the fusion event leading to its origin at the divergence of ichthyosporea and emergence of filozoa nearly a billion years ago. Distribution of WHEP domains from a single species in two or more distinct clades suggested common descent, allowing the identification of linking organisms. The discrete assortment of choanoflagellate WHEP domains with choanozoan domains as well as with those in metazoans supported the phylogenetic position of choanoflagellates as the closest sister group to metazoans. Analysis of clustering and assortment of WHEP domains provided unexpected insights into phylogenetic relationships amongst holozoan taxa. Furthermore, observed gaps in the transition between WHEP domain groupings in distant taxa allowed the prediction of undiscovered or extinct evolutionary intermediates. Analysis based on SIR domains can provide a phylogenetic counterpart to palaeontological approaches for discovering "missing links" in the tree of life.
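The criterion used to identify linking organisms (WHEP domains from a single species falling into two or more distinct clades) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_linking_species`, the tuple layout, and the species, domain, and clade labels are hypothetical placeholders, and the clade assignment of each domain is assumed to come from a phylogenetic tree built beforehand.

```python
from collections import defaultdict


def find_linking_species(domain_assignments):
    """Return species whose repeat domains fall into two or more clades.

    domain_assignments: iterable of (species, domain_id, clade) tuples.
    The clade labels are assumed to come from a prior tree-building step;
    this function only tallies them per species.
    """
    clades_by_species = defaultdict(set)
    for species, _domain_id, clade in domain_assignments:
        clades_by_species[species].add(clade)
    # A species qualifies as a candidate "linking organism" when its
    # domains are spread over at least two distinct clades.
    return {
        species: sorted(clades)
        for species, clades in clades_by_species.items()
        if len(clades) >= 2
    }


# Hypothetical placeholder data, not taken from the paper.
example_assignments = [
    ("species_A", "WHEP1", "clade_1"),
    ("species_A", "WHEP2", "clade_2"),
    ("species_B", "WHEP1", "clade_1"),
    ("species_B", "WHEP2", "clade_1"),
    ("species_C", "WHEP1", "clade_2"),
]

linking = find_linking_species(example_assignments)
print(linking)
```

In this toy input only `species_A` has domains in more than one clade, so it is the only candidate returned. A real analysis would also need to weigh the statistical support for each clade.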