National Collection of Yeast Cultures, Institute of Food Research, Norwich Research Park, Norwich NR4 7UA, UK; Bioinformatics, The Genome Analysis Centre, Norwich Research Park, Norwich NR4 7UH, UK; Department of Computational and Systems Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK.
Syst Biol. 2014 Jul;63(4):543-54. doi: 10.1093/sysbio/syu019. Epub 2014 Mar 27.
The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys, and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to difficulties in classifying paralogous versus orthologous rDNA units and to intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed datasets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single-locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial single nucleotide polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events, and show their consistency with whole-genome Structure analyses.
We conclude that our approach can transform rDNA sequence heterogeneity from a problem into a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of closely related organisms, and discuss how it could be extended to future studies of multilocus rDNA systems. [concerted evolution; genome hybridisation; phylogenetic analysis; ribosomal DNA; whole genome sequencing; yeast].
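The distinction between a fixed SNP and a partial SNP at a site in a repeated rDNA array can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `allele_fractions` and `site_type`, the 5% noise threshold, and the representation of a site as a string of base calls from reads are all assumptions made for illustration.

```python
from collections import Counter


def allele_fractions(bases, min_fraction=0.05):
    """Return {base: fraction} for alleles at one site, dropping rare calls.

    `bases` holds the base calls from all reads covering the site.
    `min_fraction` is an assumed noise threshold, not a published value.
    """
    counts = Counter(bases)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads cover this site")
    return {b: n / total for b, n in counts.items() if n / total >= min_fraction}


def site_type(bases, reference, min_fraction=0.05):
    """Label a site as 'invariant', 'SNP' (fixed in every repeat) or 'pSNP'.

    A partial SNP is variation present in only some copies of the repeat,
    so more than one allele survives the threshold at the same site.
    """
    alleles = allele_fractions(bases, min_fraction)
    non_reference = [b for b in alleles if b != reference]
    if not non_reference:
        return "invariant"
    if len(alleles) == 1:
        return "SNP"
    return "pSNP"


# Hypothetical read pileups at three sites of an rDNA unit.
print(site_type("A" * 100, "A"))            # all reads match the reference
print(site_type("G" * 100, "A"))            # every repeat carries the variant
print(site_type("A" * 70 + "G" * 30, "A"))  # variant in a subset of repeats
```

In this sketch the surviving allele fractions at a pSNP site act as a rough proxy for the proportion of repeat copies carrying each variant, which is the kind of per-strain quantity that could then be compared across strains.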