Department of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, PO Box 5003, 1432 Ås, Norway.
Microbes Environ. 2013;28(2):211-6. doi: 10.1264/jsme2.me12157. Epub 2013 Apr 20.
We addressed the challenges of analyzing next-generation 16S rRNA gene deep sequencing data from the uncharacterized microbial majority using a novel de novo semi-alignment approach. The semi-alignments were based on Orthologous Tri-Nucleotides (OTNs), which are identical trinucleotides located in the same sequence region. OTNs in error-prone homopolymeric tracts were excluded to avoid overestimating genetic distances. Phylogenetic information was derived by assuming an exponential decay in the number of OTNs shared between pairs of bacteria. OTN relatedness was also explored through principal component analysis (PCA). To evaluate the OTN approach, we reanalyzed a dataset consisting of triplicate GS FLX Titanium pyrosequencing runs for each of two experimental soil samples, and also analyzed the Greengenes core dataset. These comparisons showed that the OTN approach was superior to traditional alignments in both speed and accuracy. We therefore believe that our OTN-based semi-alignment approach will be a valuable contribution to future exploration of deep sequencing data.
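The OTN idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything specific here is an assumption made for illustration: a "sequence region" is approximated by a fixed-width window (`window`), a homopolymeric tract is approximated by a trinucleotide of three identical bases, the shared fraction is computed as a Jaccard ratio, and the exponential-decay assumption is turned into a distance by taking the negative logarithm of that fraction. The function names `otn_set`, `shared_fraction`, and `otn_distance` are hypothetical and not taken from the paper.

```python
import math


def otn_set(seq, window=50):
    """Collect (region, trinucleotide) pairs for one sequence.

    Assumptions (not from the paper): a region is a fixed-width window,
    and a trinucleotide of three identical bases stands in for a
    homopolymeric tract and is skipped.
    """
    seq = seq.upper()
    otns = set()
    for i in range(len(seq) - 2):
        tri = seq[i:i + 3]
        if set(tri) - set("ACGT"):
            continue  # skip trinucleotides containing ambiguous bases
        if len(set(tri)) == 1:
            continue  # skip homopolymeric trinucleotides such as AAA
        otns.add((i // window, tri))
    return otns


def shared_fraction(a, b):
    """Fraction of OTNs shared by two sets (Jaccard ratio, an assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def otn_distance(seq_a, seq_b, window=50):
    """Distance under an assumed exponential decay of shared OTNs.

    If the shared fraction s decays as exp(-d), then d = -ln(s).
    """
    s = shared_fraction(otn_set(seq_a, window), otn_set(seq_b, window))
    return math.inf if s == 0 else -math.log(s)
```

Because each sequence is reduced to a set of region-tagged trinucleotides, comparing two sequences needs only a set intersection rather than a full alignment, which is one plausible reading of why such an approach could be fast. A pairwise matrix of these shared fractions could also serve as input to a PCA, although the paper's actual PCA input is not specified in the abstract.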