Pruesse Elmar, Quast Christian, Knittel Katrin, Fuchs Bernhard M, Ludwig Wolfgang, Peplies Jörg, Glöckner Frank Oliver
Microbial Genomics Group, Max Planck Institute for Marine Microbiology.
Nucleic Acids Res. 2007;35(21):7188-96. doi: 10.1093/nar/gkm864. Epub 2007 Oct 18.
Sequencing ribosomal RNA (rRNA) genes is currently the method of choice for phylogenetic reconstruction, nucleic acid-based detection and quantification of microbial diversity. The ARB software suite with its corresponding rRNA datasets has been accepted by researchers worldwide as a standard tool for large-scale rRNA analysis. However, the rapid increase of publicly available rRNA sequence data has recently hampered the maintenance of comprehensive and curated rRNA knowledge databases. A new system, SILVA (from Latin silva, forest), was implemented to provide a central comprehensive web resource for up-to-date, quality-controlled databases of aligned rRNA sequences from the Bacteria, Archaea and Eukarya domains. All sequences are checked for anomalies, carry a rich set of sequence-associated contextual information, and are annotated with multiple taxonomic classifications and the latest validly described nomenclature. Furthermore, two precompiled sequence datasets compatible with ARB are offered for download on the SILVA website: (i) the reference (Ref) datasets, comprising only high-quality, nearly full-length sequences suitable for in-depth phylogenetic analysis and probe design, and (ii) the comprehensive Parc datasets, containing all publicly available rRNA sequences longer than 300 nucleotides and suitable for biodiversity analyses. The latest publicly available database release 91 (August 2007) hosts 547 521 sequences split into 461 823 small subunit and 85 689 large subunit rRNAs.
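The length criterion stated for the Parc datasets (sequences longer than 300 nucleotides) can be illustrated with a minimal sketch. This is not the SILVA pipeline, which also aligns sequences and checks them for anomalies; the names `parse_fasta`, `longer_than` and `MIN_PARC_LENGTH` are hypothetical and the input records are invented for illustration only.

```python
from typing import Iterable, Iterator, List, Tuple

# Threshold taken from the abstract: Parc keeps sequences "longer than 300 nucleotides".
MIN_PARC_LENGTH = 300


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def longer_than(
    records: Iterable[Tuple[str, str]], min_length: int = MIN_PARC_LENGTH
) -> List[Tuple[str, str]]:
    """Keep records whose sequence is strictly longer than min_length."""
    return [(h, s) for h, s in records if len(s) > min_length]


# Invented example input: one 200-nt record and one 404-nt record
# whose sequence is split across two lines.
example = [">short_seq", "ACGU" * 50, ">long_seq", "ACGU" * 100, "ACGU"]
kept = longer_than(parse_fasta(example))
print([header for header, _ in kept])
```

The comparison is strict (`>`), following the wording "longer than 300 nucleotides", so a sequence of exactly 300 nucleotides is excluded in this sketch.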