Bio-sciences R&D Division, TCS Innovation Labs, Tata Research Development & Design Centre, 54-B, Hadapsar Industrial Estate, Pune, 411013, India.
Gene. 2012 Sep 1;505(2):259-65. doi: 10.1016/j.gene.2012.06.014. Epub 2012 Jun 15.
Phylogenetic assignment of individual sequence reads to their respective taxa, referred to as 'taxonomic binning', is a key step in metagenomic analysis. Existing binning methods are limited either in speed or in binning accuracy and specificity. A method that can bin vast amounts of metagenomic sequence data rapidly and at low computational cost could therefore profoundly influence metagenomic analysis in settings with limited computational resources. We introduce TWARIT, a hybrid binning algorithm that combines short-read alignment with composition-based signature sorting to achieve rapid binning without compromising binning accuracy or specificity. TWARIT was validated with simulated and real-world metagenomes, and the results demonstrate significantly lower overall binning times than those of existing methods. Furthermore, the binning accuracy and specificity of TWARIT were observed to be comparable or superior to those of existing methods. A web server implementing the TWARIT algorithm is available at http://metagenomics.atc.tcs.com/Twarit/
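The hybrid strategy named in the abstract (alignment first, composition-based signatures second) can be illustrated with a toy sketch. This is not TWARIT's implementation: the abstract does not specify the aligner, the signature, or any thresholds. The code below assumes, purely for illustration, that an exact substring match stands in for short-read alignment, that the signature is a tetranucleotide frequency vector compared by cosine similarity, and that the hypothetical names `signature`, `cosine`, `bin_read` and the `min_similarity` cutoff of 0.5 are acceptable placeholders.

```python
from collections import Counter
from itertools import product
import math

K = 4  # assumed signature length (tetranucleotides); not taken from the paper
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def signature(seq):
    """Return the normalized k-mer frequency vector of a sequence."""
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    # k-mers containing characters other than A/C/G/T are ignored.
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bin_read(read, references, min_similarity=0.5):
    """Return (taxon, stage) for a read; taxon is None if unassigned.

    Stage 1: exact substring match, a stand-in for short-read alignment.
    Stage 2: best composition-signature similarity above min_similarity.
    """
    read = read.upper()
    for taxon, genome in references.items():
        if read in genome.upper():
            return taxon, "alignment"

    read_sig = signature(read)
    best_taxon, best_score = None, min_similarity
    for taxon, genome in references.items():
        score = cosine(read_sig, signature(genome))
        if score > best_score:
            best_taxon, best_score = taxon, score
    if best_taxon is None:
        return None, "unassigned"
    return best_taxon, "composition"


if __name__ == "__main__":
    # Hypothetical toy references, not real genomes.
    refs = {"taxonA": "AT" * 20, "taxonB": "GC" * 20}
    for r in ("ATATATAT", "ATATTATATA", "CCCCCCCC"):
        print(r, bin_read(r, refs))
```

The ordering reflects the general idea behind such hybrids: reads that match a reference directly are assigned cheaply and with high confidence, and only the remainder goes through the composition comparison. A real pipeline would replace the substring test with an actual short-read aligner and would tune the signature and threshold on benchmark data.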