Schultz Jörg, Maisel Stefanie, Gerlach Daniel, Müller Tobias, Wolf Matthias
Department of Bioinformatics, University of Würzburg, Biocenter, Am Hubland, D-97074 Würzburg, Germany.
RNA. 2005 Apr;11(4):361-4. doi: 10.1261/rna.7204505.
The ongoing characterization of novel species creates the need for a molecular marker that can be used both for species-level systematics and, simultaneously, for mega-systematics. Recently, the internal transcribed spacer 2 (ITS2) sequence was suggested for this purpose, as it combines high sequence divergence with an assumed conservation of structure. This hypothesis was based mainly on small-scale analyses comparing a limited number of sequences. Here, we report a large-scale analysis of more than 54,000 currently known ITS2 sequences, with the goal of evaluating the hypothesis of a conserved structural core and assessing its use for automated large-scale phylogenetics. Structure prediction revealed that the previously described core structure can be found in more than 5000 sequences from a wide variety of eukaryotic taxa, indicating that the core secondary structure is indeed conserved. This conserved structure allowed the automated alignment of extremely divergent sequences, as exemplified by the ITS2 sequences of a ctenophorean eumetazoon and a volvocalean green alga. All classified sequences, together with their structures, can be accessed at http://www.biozentrum.uni-wuerzburg.de/bioinformatik/projects/ITS2.html. Furthermore, we found that, although sample sequences are known for most major taxa, coverage differs profoundly among them, which might hinder general usage. In summary, our analysis strengthens the potential of ITS2 as a general phylogenetic marker and provides a data source for further ITS2-based analyses.