Lagesen Karin, Hallin Peter, Rødland Einar Andreas, Staerfeldt Hans-Henrik, Rognes Torbjørn, Ussery David W
Centre for Molecular Biology and Neuroscience and Institute of Medical Microbiology, University of Oslo, NO-0027 Oslo, Norway.
Nucleic Acids Res. 2007;35(9):3100-8. doi: 10.1093/nar/gkm160. Epub 2007 Apr 22.
The publication of a complete genome sequence is usually accompanied by annotations of its genes. In contrast to protein coding genes, genes for ribosomal RNA (rRNA) are often poorly or inconsistently annotated. This makes comparative studies based on rRNA genes difficult. We have therefore created computational predictors for the major rRNA species from all kingdoms of life and compiled them into a program called RNAmmer. The program uses hidden Markov models trained on data from the 5S ribosomal RNA database and the European ribosomal RNA database project. A pre-screening step makes the method fast with little loss of sensitivity, enabling the analysis of a complete bacterial genome in less than a minute. Results from running RNAmmer on a large set of genomes indicate that the location of rRNAs can be predicted with a very high level of accuracy. Novel, unannotated rRNAs are also predicted in many genomes. The software as well as the genome analysis results are available at the CBS web server.