Department of Bioinformatics, Biocenter, University of Würzburg, Am Hubland, 97074 Würzburg, Germany.
Nucleic Acids Res. 2010 Jan;38(Database issue):D275-9. doi: 10.1093/nar/gkp966. Epub 2009 Nov 17.
The internal transcribed spacer 2 (ITS2) is a widely used phylogenetic marker. In the past, it has mainly been used for species-level classifications. A wider applicability is now becoming apparent, and here the conserved structure of the RNA molecule plays a vital role. We have developed the ITS2 Database (http://its2.bioapps.biozentrum.uni-wuerzburg.de), which holds information about the sequence, structure and taxonomic classification of all ITS2 sequences in GenBank. In the new version, we use hidden Markov models (HMMs) to identify and delineate the ITS2, which led to a major redesign of the annotation pipeline. This allowed the identification of more than 160,000 correct full-length and more than 50,000 partial structures. In the web interface, these can now be searched with a modified BLAST that considers both sequence and structure, enabling rapid taxon sampling. Novel sequences can be annotated using the HMM-based approach and modelled according to multiple template structures. Sequences can also be searched for known and newly identified motifs. Together, the database and the web server form an exhaustive resource for ITS2-based phylogenetic analyses.
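To make the motif search idea concrete, here is a minimal sketch of scanning a sequence for a short motif. It is not the database's implementation, which the abstract describes only at a high level. It uses a plain regular-expression scan, and the function name `find_motif`, the reduced IUPAC table and the example motif are assumptions chosen for illustration.

```python
import re

# Small subset of IUPAC nucleotide codes mapped to regex character classes.
# Illustrative assumption: only these codes are supported in this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U", "T": "U",
    "R": "[AG]", "Y": "[CU]", "N": "[ACGU]",
}


def find_motif(sequence, motif):
    """Return 0-based start positions of all motif hits, including overlaps."""
    # Normalise DNA input to upper-case RNA so T and U are treated alike.
    rna = sequence.upper().replace("T", "U")
    pattern = "".join(IUPAC[base] for base in motif.upper())
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", rna)]


if __name__ == "__main__":
    # Hypothetical sequence fragment, not taken from the database.
    print(find_motif("ACGTGGTTGGU", "UGGU"))
```

A real pipeline would typically restrict such a scan to a particular region of the annotated structure, which this sketch does not attempt.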