Center of System Biomedical Sciences, University of Shanghai for Science and Technology, Shanghai 200093, P R China.
Bioinformatics. 2011 Aug 1;27(15):2031-7. doi: 10.1093/bioinformatics/btr319. Epub 2011 Jun 2.
Several new de novo assembly tools have been developed recently to assemble short sequencing reads generated by next-generation sequencing platforms. However, the performance of these tools under various conditions has not been fully investigated, and sufficient information is not currently available for informed decisions to be made regarding the tool that would be most likely to produce the best performance under a specific set of conditions.
We studied and compared the performance of commonly used de novo assembly tools specifically designed for next-generation sequencing data, including SSAKE, VCAKE, Euler-sr, Edena, Velvet, ABySS and SOAPdenovo. Tools were compared using several performance criteria, including N50 length, sequence coverage and assembly accuracy. Various properties of read data, including single-end/paired-end, sequence GC content, depth of coverage and base calling error rates, were investigated for their effects on the performance of different assembly tools. We also compared the computation time and memory usage of these seven tools. Based on the results of our comparison, the relative performance of individual tools is summarized and tentative guidelines for optimal selection of different assembly tools, under different conditions, are provided.
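Of the criteria listed above, N50 length is the one most easily misread, so a minimal sketch may help. It assumes the conventional definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The abstract does not state which variant the authors used, and the function name `n50` and the example lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (conventional definition).

    Assumption: N50 is the length L such that contigs of length >= L
    together account for at least half of the total assembled bases.
    Returns 0 for an empty input.
    """
    lengths = sorted(contig_lengths, reverse=True)  # longest contigs first
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running against total to avoid fractional halves.
        if 2 * running >= total:
            return length
    return 0


# Hypothetical contig lengths (in bases), for illustration only.
example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example))  # 8: the contigs 10 + 9 + 8 = 27 cover half of the 54 total bases
```

A larger N50 indicates a more contiguous assembly, but it says nothing about correctness, which is presumably why the comparison also reports sequence coverage and assembly accuracy.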