Wagner Günter P, Fried Claudia, Prohaska Sonja J, Stadler Peter F
Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, USA.
Mol Biol Evol. 2004 Nov;21(11):2116-21. doi: 10.1093/molbev/msh221. Epub 2004 Jul 28.
In many eukaryotic genomes only a small fraction of the DNA codes for proteins, but the non-protein-coding DNA harbors important genetic elements that direct the development and physiology of the organism, such as promoters, enhancers, insulators, and micro-RNA genes. The molecular evolution of these genetic elements is difficult to study because their functional significance is hard to deduce from sequence information alone. Here we propose an approach to the study of the rate of evolution of functional non-coding sequences at a macro-evolutionary scale. We identify functionally important non-coding sequences as Conserved Non-Coding Nucleotide (CNCN) sequences from the comparison of two outgroup species. The CNCN sequences so identified are then compared to their homologous sequences in a pair of ingroup species, and we monitor the degree of modification these sequences suffered in the two ingroup lineages. We propose a method to test for rate differences in the modification of CNCN sequences between the two ingroup lineages, as well as a method to estimate their rate of modification. We apply this method to the full sequences of the HoxA clusters from six gnathostome species: a shark, Heterodontus francisci; a basal ray-finned fish, Polypterus senegalus; an amphibian, Xenopus tropicalis; and three mammalian species, human, rat and mouse. The results show that the evolutionary rate of CNCN sequences is not distinguishable among the three mammalian lineages, while the Xenopus lineage has a significantly increased rate of evolution. Furthermore, the estimates of the rate parameters suggest that in the stem lineage of mammals the rate of CNCN sequence evolution was more than twice the rate observed within the placental amniote clade, suggesting a high rate of evolution of cis-regulatory elements during the origin of amniotes and mammals. We conclude that the proposed methods can be used for testing hypotheses about the rate and pattern of evolution of putative cis-regulatory elements.
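The outgroup/ingroup design described in the abstract can be illustrated with a small sketch. The abstract does not state the test statistic, so the following is not the authors' method; it is an assumed stand-in that uses an exact two-sided sign test on discordant sites. It also simplifies CNCN detection to single alignment columns on which the two outgroups agree, whereas a real analysis would work on conserved blocks from a multiple alignment. All function names and the toy sequences are hypothetical.

```python
from math import comb


def conserved_columns(out1, out2):
    """Indices where both outgroup sequences carry the same nucleotide.

    Assumption: the four sequences are pre-aligned and of equal length;
    gaps and ambiguity codes are never counted as conserved.
    """
    return [i for i, (a, b) in enumerate(zip(out1, out2))
            if a == b and a in "ACGT"]


def discordant_counts(out1, out2, in_a, in_b):
    """Count conserved sites modified in exactly one ingroup lineage."""
    n_a = n_b = 0
    for i in conserved_columns(out1, out2):
        a_changed = in_a[i] != out1[i]
        b_changed = in_b[i] != out1[i]
        if a_changed and not b_changed:
            n_a += 1
        elif b_changed and not a_changed:
            n_b += 1
    return n_a, n_b


def exact_sign_test(n_a, n_b):
    """Two-sided exact binomial p-value for H0: both lineages modify
    conserved sites at the same rate (each discordant site is equally
    likely to fall in either lineage)."""
    n = n_a + n_b
    if n == 0:
        return 1.0
    k = min(n_a, n_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy, made-up sequences (not real HoxA data).
out1 = "ACGTACGTAC"   # outgroup 1
out2 = "ACGTACGTAA"   # outgroup 2 (differs at the last column)
in_a = "ACGTACGTAC"   # ingroup lineage A
in_b = "TCGAACCTAC"   # ingroup lineage B

n_a, n_b = discordant_counts(out1, out2, in_a, in_b)
p_value = exact_sign_test(n_a, n_b)
print(n_a, n_b, p_value)
```

Sites changed in both ingroups or in neither carry no information about a rate difference under this simplified test, which is why only discordant sites are counted. Estimating the rates themselves, as the abstract also describes, would need an explicit substitution model and is outside this sketch.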