An automated phylogenetic tree-based small subunit rRNA taxonomy and alignment pipeline (STAP).

Authors

Wu Dongying, Hartman Amber, Ward Naomi, Eisen Jonathan A

Affiliation

UC Davis Genome Center, University of California Davis, Davis, California, United States of America.

Publication information

PLoS One. 2008 Jul 2;3(7):e2566. doi: 10.1371/journal.pone.0002566.

DOI:10.1371/journal.pone.0002566

PMID:18596968

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2432038/

Abstract

Comparative analysis of small-subunit ribosomal RNA (ss-rRNA) gene sequences forms the basis for much of what we know about the phylogenetic diversity of both cultured and uncultured microorganisms. As sequencing costs continue to decline and throughput increases, sequences of ss-rRNA genes are being obtained at an ever-increasing rate. This increasing flow of data has opened many new windows into microbial diversity and evolution, and at the same time has created significant methodological challenges. Those processes which commonly require time-consuming human intervention, such as the preparation of multiple sequence alignments, simply cannot keep up with the flood of incoming data. Fully automated methods of analysis are needed. Notably, existing automated methods avoid one or more steps that, though computationally costly or difficult, we consider to be important. In particular, we regard both the building of multiple sequence alignments and the performance of high quality phylogenetic analysis to be necessary. We describe here our fully-automated ss-rRNA taxonomy and alignment pipeline (STAP). It generates both high-quality multiple sequence alignments and phylogenetic trees, and thus can be used for multiple purposes including phylogenetically-based taxonomic assignments and analysis of species diversity in environmental samples. The pipeline combines publicly-available packages (PHYML, BLASTN and CLUSTALW) with our automatic alignment, masking, and tree-parsing programs. Most importantly, this automated process yields results comparable to those achievable by manual analysis, yet offers speed and capacity that are unattainable by manual efforts.
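The abstract names alignment masking as one of the pipeline's steps but does not say how it is done. As a rough illustration only, the sketch below shows one common, generic form of masking: dropping alignment columns in which most sequences have a gap. The function name `mask_alignment`, the `max_gap_fraction` threshold, and the toy sequences are assumptions made for this example; they are not taken from STAP.

```python
from typing import Dict


def mask_alignment(aligned: Dict[str, str],
                   max_gap_fraction: float = 0.5) -> Dict[str, str]:
    """Drop alignment columns whose gap fraction exceeds max_gap_fraction.

    Generic gap-based masking for illustration; the threshold and the
    criterion itself are assumptions, not STAP's documented behavior.
    """
    if not aligned:
        return {}

    # Every row of a multiple sequence alignment must have the same length.
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")

    n_cols = lengths.pop()
    n_seqs = len(aligned)

    # Collect the indices of columns that are sufficiently gap-free.
    keep = []
    for col in range(n_cols):
        gaps = sum(1 for seq in aligned.values() if seq[col] in "-.")
        if gaps / n_seqs <= max_gap_fraction:
            keep.append(col)

    # Rebuild each sequence from the retained columns only.
    return {name: "".join(seq[c] for c in keep)
            for name, seq in aligned.items()}


# Toy alignment (hypothetical data): columns 3 and 6 are mostly gaps.
toy = {
    "query": "AC-GT-A",
    "ref_1": "ACGGT-A",
    "ref_2": "AC-GTTA",
}
masked = mask_alignment(toy)
```

In a workflow of the kind the abstract describes, a masked alignment like this would then be passed to a tree-building program; the sketch covers only the masking idea and makes no claim about the criteria STAP actually applies.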

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b85c/2432038/d6aa14879ac7/pone.0002566.g001.jpg
