Bio-sciences R&D Division, TCS Innovation Labs, Tata Consultancy Services Limited, Madhapur, Hyderabad, Andhra Pradesh, India.
Bioinformatics. 2011 Jan 1;27(1):22-30. doi: 10.1093/bioinformatics/btq608. Epub 2010 Oct 28.
Compared with composition-based binning algorithms, alignment-based binning algorithms achieve significantly higher binning accuracy and specificity. However, being alignment-based, the latter class of algorithms requires an enormous amount of time and computing resources to bin large metagenomic datasets. The motivation was to develop a binning approach that can analyze metagenomic datasets as rapidly as composition-based approaches while retaining the accuracy and specificity of alignment-based algorithms. This article describes a hybrid binning approach (SPHINX) that achieves high binning efficiency by utilizing the principles of both 'composition'- and 'alignment'-based binning algorithms.
Validation results with simulated sequence datasets indicate that SPHINX is able to analyze metagenomic sequences as rapidly as composition-based algorithms. Furthermore, the binning efficiency of SPHINX (in terms of accuracy and specificity of assignments) is observed to be comparable to that obtained using alignment-based algorithms.
A web server for the SPHINX algorithm is available at http://metagenomics.atc.tcs.com/SPHINX/.