Gossamer--a resource-efficient de novo assembler.

Author information

NICTA Victoria Research Laboratory, Department of Computing and Information Systems, The University of Melbourne, Parkville, Victoria 3010, Australia.

Publication information

Bioinformatics. 2012 Jul 15;28(14):1937-8. doi: 10.1093/bioinformatics/bts297. Epub 2012 May 18.

DOI:10.1093/bioinformatics/bts297

PMID:22611131

Abstract

MOTIVATION

The de novo assembly of short read high-throughput sequencing data poses significant computational challenges. The volume of data is huge; the reads are tiny compared to the underlying sequence, and there are significant numbers of sequencing errors. There are numerous software packages that allow users to assemble short reads, but most are either limited to relatively small genomes (e.g. bacteria) or require large computing infrastructure or employ greedy algorithms and thus often do not yield high-quality results.

RESULTS

We have developed Gossamer, an implementation of the de Bruijn approach to assembly that requires close to the theoretical minimum of memory, but still allows efficient processing. Our results show that it is space efficient and produces high-quality assemblies.
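To make the de Bruijn approach concrete, here is a minimal Python sketch. It is not Gossamer's implementation: the function names, the toy reads, and the choice of k are all illustrative assumptions. The abstract does not say which memory bound it means; the sketch assumes one common formulation, in which the graph is stored as a set of n distinct k-mers out of 4^k possible, so the information-theoretic minimum is log2 C(4^k, n) bits.

```python
from collections import defaultdict
from math import comb, log2


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer node to the set of (k-1)-mer nodes it points to."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its (k-1)-prefix to its (k-1)-suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def count_edges(graph):
    """Number of distinct k-mers (edges) in the graph."""
    return sum(len(successors) for successors in graph.values())


def min_bits_for_kmer_set(k, n):
    """Assumed bound: bits needed to pick n distinct k-mers out of 4**k."""
    return log2(comb(4 ** k, n))


# Toy reads for illustration only; real inputs are millions of short reads.
reads = ["ACGTAC", "CGTACG"]
graph = build_de_bruijn(reads, k=4)
n_edges = count_edges(graph)
print(n_edges, "edges;", round(min_bits_for_kmer_set(4, n_edges), 2), "bits minimum")
```

A plain dictionary of sets like this uses far more memory than the bound; the sketch only shows what is being counted, not how a succinct representation would achieve it.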

AVAILABILITY

Gossamer is available for non-commercial use from http://www.genomics.csse.unimelb.edu.au/product-gossamer.php.

Similar articles

QuorUM: An Error Corrector for Illumina Reads.

PLoS One. 2015 Jun 17;10(6):e0130821. doi: 10.1371/journal.pone.0130821. eCollection 2015.

When less is more: 'slicing' sequencing data improves read decoding accuracy and de novo assembly quality.

Bioinformatics. 2015 Sep 15;31(18):2972-80. doi: 10.1093/bioinformatics/btv311. Epub 2015 May 20.

TraRECo: a greedy approach based de novo transcriptome assembler with read error correction using consensus matrix.

BMC Genomics. 2018 Sep 4;19(1):653. doi: 10.1186/s12864-018-5034-x.

BFC: correcting Illumina sequencing errors.

Bioinformatics. 2015 Sep 1;31(17):2885-7. doi: 10.1093/bioinformatics/btv290. Epub 2015 May 6.

FinisherSC: a repeat-aware tool for upgrading de novo assembly using long reads.

Bioinformatics. 2015 Oct 1;31(19):3207-9. doi: 10.1093/bioinformatics/btv280. Epub 2015 Jun 3.

The present and future of de novo whole-genome assembly.

Brief Bioinform. 2018 Jan 1;19(1):23-40. doi: 10.1093/bib/bbw096.

Assembly of long error-prone reads using de Bruijn graphs.

Proc Natl Acad Sci U S A. 2016 Dec 27;113(52):E8396-E8405. doi: 10.1073/pnas.1604560113. Epub 2016 Dec 12.

Subset selection of high-depth next generation sequencing reads for de novo genome assembly using MapReduce framework.

BMC Genomics. 2015;16 Suppl 12(Suppl 12):S9. doi: 10.1186/1471-2164-16-S12-S9. Epub 2015 Dec 9.

NeatFreq: reference-free data reduction and coverage normalization for De Novo sequence assembly.

BMC Bioinformatics. 2014 Nov 19;15(1):357. doi: 10.1186/s12859-014-0357-3.

Cited by

Conway-Bromage-Lyndon (CBL): an exact, dynamic representation of k-mer sets.

Bioinformatics. 2024 Jun 28;40(Suppl 1):i48-i57. doi: 10.1093/bioinformatics/btae217.

Spherical: an iterative workflow for assembling metagenomic datasets.

BMC Bioinformatics. 2018 Jan 24;19(1):20. doi: 10.1186/s12859-018-2028-2.

Evaluation of the impact of Illumina error correction tools on de novo genome assembly.

BMC Bioinformatics. 2017 Aug 18;18(1):374. doi: 10.1186/s12859-017-1784-8.

Genomic characterisation of Eμ-Myc mouse lymphomas identifies Bcor as a Myc co-operative tumour-suppressor gene.

Nat Commun. 2017 Mar 6;8:14581. doi: 10.1038/ncomms14581.

BLESS 2: accurate, memory-efficient and fast error correction method.

Bioinformatics. 2016 Aug 1;32(15):2369-71. doi: 10.1093/bioinformatics/btw146. Epub 2016 Mar 24.

Host Subtraction, Filtering and Assembly Validations for Novel Viral Discovery Using Next Generation Sequencing Data.

PLoS One. 2015 Jun 22;10(6):e0129059. doi: 10.1371/journal.pone.0129059. eCollection 2015.

Metagenomics of rumen bacteriophage from thirteen lactating dairy cattle.

BMC Microbiol. 2013 Nov 1;13:242. doi: 10.1186/1471-2180-13-242.

TIGER: tiled iterative genome assembler.

BMC Bioinformatics. 2012;13 Suppl 19(Suppl 19):S18. doi: 10.1186/1471-2105-13-S19-S18. Epub 2012 Dec 19.
