Department of Biology and Biochemistry, University of Houston, Houston, TX, 77204-5001, USA.
Microb Ecol. 2011 Apr;61(3):669-75. doi: 10.1007/s00248-010-9779-2. Epub 2010 Nov 27.
Comparison of variable regions within the 16S rRNA gene is widely used to characterize relationships between bacteria and to identify the phylogenetic affiliation of unknown bacteria. In environmental studies, polymerase chain reaction amplification of the 16S rRNA gene, followed by cloning and sequencing of numerous individual clones, is an extensively used molecular method for elucidating microbial diversity. The sequencing process typically uses a forward and reverse primer pair to produce two partial reads (700 to 800 base pairs each) that overlap and together cover a large region of the full 16S rRNA sequence (approximately 1.5 kb). In a typical application, this approach rapidly generates very large 16S rRNA datasets that can overwhelm manual processing efforts, leading to both delays and errors. In particular, the approach presents two computational challenges: (1) the assembly of a composite sequence from the two partial reads and (2) the subsequent identification of the organism represented by each newly sequenced clone. Herein, we describe a software package, search, trim, identify, track, and capture the uniqueness of 16S rRNAs using public and in-house database (STITCH), which offers automated sequence pair splicing and genetic identification, thus simplifying the computationally intensive analysis of large sequencing libraries. The STITCH software is freely accessible over the Internet at: http://prion.bchs.uh.edu/stitch/.
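The first challenge, assembling a composite sequence from an overlapping forward and reverse read, can be illustrated with a minimal sketch. This is not the STITCH algorithm, which the abstract does not describe. It assumes error-free reads and an exact suffix/prefix overlap, and the names `reverse_complement`, `splice_pair`, and `min_overlap` are hypothetical. Real reads would need quality trimming and a mismatch-tolerant alignment.

```python
from typing import Optional


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def splice_pair(forward: str, reverse: str, min_overlap: int = 20) -> Optional[str]:
    """Merge a forward read with a reverse read into one composite sequence.

    Assumption: the reverse read comes from the opposite strand, so it is
    reverse-complemented first. The reads are then joined at the longest
    exact overlap between the end of the forward read and the start of the
    reoriented reverse read. Returns None if no overlap of at least
    min_overlap bases is found.
    """
    fwd = forward.upper()
    rev = reverse_complement(reverse).upper()
    longest_possible = min(len(fwd), len(rev))
    # Try the longest candidate overlap first and shrink until one matches.
    for k in range(longest_possible, min_overlap - 1, -1):
        if fwd[-k:] == rev[:k]:
            return fwd + rev[k:]
    return None
```

In this sketch a pair that fails to overlap returns `None`, so it can be flagged for manual inspection rather than silently concatenated. The `min_overlap` threshold trades sensitivity against the risk of joining reads on a short chance match.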