ALLPATHS: de novo assembly of whole-genome shotgun microreads.

Author information

Butler Jonathan, MacCallum Iain, Kleber Michael, Shlyakhter Ilya A, Belmonte Matthew K, Lander Eric S, Nusbaum Chad, Jaffe David B

Affiliation

Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141, USA.

Publication information

Genome Res. 2008 May;18(5):810-20. doi: 10.1101/gr.7337908. Epub 2008 Mar 13.

Abstract

New DNA sequencing technologies deliver data at dramatically lower costs but demand new analytical methods to take full advantage of the very short reads that they produce. We provide an initial, theoretical solution to the challenge of de novo assembly from whole-genome shotgun "microreads." For 11 genomes of sizes up to 39 Mb, we generated high-quality assemblies from 80x coverage by paired 30-base simulated reads modeled after real Illumina-Solexa reads. The bacterial genomes of Campylobacter jejuni and Escherichia coli assemble optimally, yielding single perfect contigs, and larger genomes yield assemblies that are highly connected and accurate. Assemblies are presented in a graph form that retains intrinsic ambiguities such as those arising from polymorphism, thereby providing information that has been absent from previous genome assemblies. For both C. jejuni and E. coli, this assembly graph is a single edge encompassing the entire genome. Larger genomes produce more complicated graphs, but the vast majority of the bases in their assemblies are present in long edges that are nearly always perfect. We describe a general method for genome assembly that can be applied to all types of DNA sequence data, not only short read data, but also conventional sequence reads.
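
As a rough illustration of the data volume implied by 80x coverage with 30-base reads, the sketch below applies the standard coverage relation (coverage = number of reads × read length / genome size) to the largest genome size mentioned in the abstract, 39 Mb. The function names are hypothetical and are not part of ALLPATHS, and the calculation assumes uniform coverage and ignores sequencing errors.

```python
def reads_needed(genome_size_bp: int, coverage: int, read_length_bp: int) -> int:
    """Smallest read count N with N * read_length_bp >= coverage * genome_size_bp."""
    total_bases = genome_size_bp * coverage
    # Integer ceiling division avoids floating-point rounding.
    return (total_bases + read_length_bp - 1) // read_length_bp


def read_pairs_needed(genome_size_bp: int, coverage: int, read_length_bp: int) -> int:
    """Number of read pairs, assuming every read belongs to exactly one pair."""
    n_reads = reads_needed(genome_size_bp, coverage, read_length_bp)
    return (n_reads + 1) // 2


if __name__ == "__main__":
    # Values taken from the abstract: 39 Mb genome, 80x coverage, 30-base reads.
    genome, cov, length = 39_000_000, 80, 30
    print(reads_needed(genome, cov, length))       # 104000000
    print(read_pairs_needed(genome, cov, length))  # 52000000
```

Under these assumptions, the largest genome in the study would correspond to roughly 104 million reads, or 52 million pairs.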
