BaseClear B.V., Einsteinweg 5, 2333 CC Leiden, The Netherlands.
Bioinformatics. 2011 Feb 15;27(4):578-9. doi: 10.1093/bioinformatics/btq683. Epub 2010 Dec 12.
De novo assembly tools play a central role in reconstructing genomes from next-generation sequencing (NGS) data and usually yield a number of contigs. Using paired-read sequencing data, it is possible to assess the order, distance and orientation of contigs and combine them into so-called scaffolds. Although scaffolding is a crucial step in finishing genomes, scaffolding algorithms are often built-in functions of de novo assembly tools and cannot be controlled independently. Here we present a new tool, called SSPACE, which is a stand-alone scaffolder of pre-assembled contigs using paired-read data. Its main features are a short runtime, support for multiple paired-end and/or mate-pair libraries as input, and optional contig extension with unmapped sequence reads. SSPACE shows promising results on both prokaryote and eukaryote genomic test sets, where the number of initial contigs was reduced by at least 75%.
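The idea of using paired reads to link contigs can be illustrated with a toy sketch. This is not SSPACE's algorithm or interface: the function names (`count_links`, `build_scaffolds`, `estimate_gap`), the `min_links` threshold and the input layout are all hypothetical, chosen only for illustration. The sketch counts read pairs whose two reads map to different contigs, keeps contig pairs supported by enough links, groups linked contigs together, and estimates a gap from the library insert size. It deliberately ignores contig order and orientation, which a real scaffolder must resolve.

```python
from collections import Counter


def count_links(pair_hits, min_links=2):
    """Count read pairs that connect two different contigs.

    pair_hits: iterable of (contig_a, contig_b), one entry per read pair,
    naming the contig each read of the pair mapped to (assumed input layout).
    Returns only the contig pairs supported by at least min_links read pairs.
    """
    counts = Counter()
    for a, b in pair_hits:
        if a != b:  # pairs within one contig carry no scaffolding information
            counts[tuple(sorted((a, b)))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_links}


def build_scaffolds(contigs, links):
    """Group contigs joined by supported links (union-find, no ordering)."""
    parent = {c: c for c in contigs}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path compression
            c = parent[c]
        return c

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for c in contigs:
        groups.setdefault(find(c), []).append(c)
    return sorted(sorted(g) for g in groups.values())


def estimate_gap(insert_size, len_a, start_a, end_b):
    """Estimate the gap between contig A and a following contig B.

    Assumes the forward read starts at start_a on A, the reverse read ends
    at end_b on B (measured from B's start), and both contigs are in the
    same orientation. A negative value would suggest an overlap.
    """
    return insert_size - (len_a - start_a) - end_b


pairs = [("c1", "c2"), ("c2", "c1"), ("c2", "c3"), ("c4", "c4")]
links = count_links(pairs, min_links=2)
scaffolds = build_scaffolds(["c1", "c2", "c3", "c4"], links)
```

In this example only the `c1`/`c2` connection has two supporting pairs, so four contigs collapse into three groups. The single `c2`/`c3` pair falls below the assumed threshold and is treated as unreliable.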