Meraculous: de novo genome assembly with short paired-end reads.

Author information

U.S. Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America.

Publication information

PLoS One. 2011;6(8):e23501. doi: 10.1371/journal.pone.0023501. Epub 2011 Aug 18.

DOI: 10.1371/journal.pone.0023501

PMID: 21876754

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3158087/

Abstract

We describe a new algorithm, meraculous, for whole genome assembly of deep paired-end short reads, and apply it to the assembly of a dataset of paired 75-bp Illumina reads derived from the 15.4 megabase genome of the haploid yeast Pichia stipitis. More than 95% of the genome is recovered, with no errors; half the assembled sequence is in contigs longer than 101 kilobases and in scaffolds longer than 269 kilobases. Incorporating fosmid ends recovers entire chromosomes. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (deBruijn) graph of oligonucleotides with unique high quality extensions in the dataset, avoiding an explicit error correction step as used in other short-read assemblers. A novel memory-efficient hashing scheme is introduced. The resulting contigs are ordered and oriented using paired reads separated by ∼280 bp or ∼3.2 kbp, and many gaps between contigs can be closed using paired-end placements. Practical issues with the dataset are described, and prospects for assembling larger genomes are discussed.
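The core idea in the abstract, walking only through k-mers whose extensions are unique and well supported, can be illustrated with a toy sketch. This is not the Meraculous implementation. It assumes a simple count threshold (`min_count`) in place of base-quality filtering, works on the forward strand only, skips circular paths, and uses ordinary Python dictionaries rather than the paper's memory-efficient hashing scheme. All function names are hypothetical.

```python
from collections import Counter, defaultdict


def extension_counts(reads, k):
    """Count each k-mer and the bases observed immediately left and right of it."""
    occ = Counter()
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            occ[kmer] += 1
            if i > 0:
                left[kmer][read[i - 1]] += 1
            if i + k < len(read):
                right[kmer][read[i + k]] += 1
    return occ, left, right


def unique_base(counts, min_count):
    """Return the only base seen at least min_count times, or None if 0 or >1 qualify."""
    supported = [base for base, n in counts.items() if n >= min_count]
    return supported[0] if len(supported) == 1 else None


def next_right(kmer, ext):
    """Successor k-mer, but only if the link is unique when viewed from both ends."""
    base = ext[kmer][1]
    if base is None:
        return None
    nxt = kmer[1:] + base
    return nxt if nxt in ext and ext[nxt][0] == kmer[0] else None


def prev_left(kmer, ext):
    """Predecessor k-mer, under the same mutual-uniqueness condition."""
    base = ext[kmer][0]
    if base is None:
        return None
    prv = base + kmer[:-1]
    return prv if prv in ext and ext[prv][1] == kmer[-1] else None


def build_contigs(reads, k, min_count=2):
    """Conservative traversal: extend only across unique, well-supported links."""
    occ, left, right = extension_counts(reads, k)
    # Assumption: k-mers seen fewer than min_count times are treated as likely errors.
    ext = {
        kmer: (unique_base(left.get(kmer, {}), min_count),
               unique_base(right.get(kmer, {}), min_count))
        for kmer, n in occ.items() if n >= min_count
    }
    contigs, seen = [], set()
    for kmer in sorted(ext):
        # Start only at k-mers that cannot be extended to the left.
        if kmer in seen or prev_left(kmer, ext) is not None:
            continue
        contig = kmer
        seen.add(kmer)
        cur = next_right(kmer, ext)
        while cur is not None and cur not in seen:
            contig += cur[-1]
            seen.add(cur)
            cur = next_right(cur, ext)
        contigs.append(contig)
    return contigs


if __name__ == "__main__":
    genome = "ACGTTGCATGGA"
    # Every 8-base window, sampled twice, plus one read with a substitution error.
    reads = [genome[i:i + 8] for i in range(5)] * 2 + ["ACGTAGCA"]
    print(build_contigs(reads, k=4))
```

In this toy input the erroneous read contributes only extensions seen once, so they fall below the threshold and are ignored during traversal without any separate error-correction pass. A real assembler would also need to handle reverse complements, quality scores, and repeats.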

Similar articles

Benchmarking of de novo assembly algorithms for Nanopore data reveals optimal performance of OLC approaches.

BMC Genomics. 2016 Aug 22;17 Suppl 7(Suppl 7):507. doi: 10.1186/s12864-016-2895-8.

Fine de novo sequencing of a fungal genome using only SOLiD short read data: verification on Aspergillus oryzae RIB40.

PLoS One. 2013 May 7;8(5):e63673. doi: 10.1371/journal.pone.0063673. Print 2013.

Illumina error correction near highly repetitive DNA regions improves de novo genome assembly.

BMC Bioinformatics. 2019 Jun 3;20(1):298. doi: 10.1186/s12859-019-2906-2.

Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

PLoS One. 2013 Apr 12;8(4):e60204. doi: 10.1371/journal.pone.0060204. Print 2013.

Evaluation of short read metagenomic assembly.

BMC Genomics. 2011;12 Suppl 2(Suppl 2):S8. doi: 10.1186/1471-2164-12-S2-S8. Epub 2011 Jul 27.

Paired de bruijn graphs: a novel approach for incorporating mate pair information into genome assemblers.

J Comput Biol. 2011 Nov;18(11):1625-34. doi: 10.1089/cmb.2011.0151. Epub 2011 Oct 14.

Local de novo assembly of RAD paired-end contigs using short sequencing reads.

PLoS One. 2011 Apr 13;6(4):e18561. doi: 10.1371/journal.pone.0018561.

Chromosome-level de novo assembly of Coprinopsis cinerea A43mut B43mut pab1-1 #326 and genetic variant identification of mutants using Nanopore MinION sequencing.

Fungal Genet Biol. 2021 Jan;146:103485. doi: 10.1016/j.fgb.2020.103485. Epub 2020 Nov 27.

SHARCGS, a fast and highly accurate short-read assembly algorithm for de novo genomic sequencing.

Genome Res. 2007 Nov;17(11):1697-706. doi: 10.1101/gr.6435207. Epub 2007 Oct 1.

Cited by

The genomic origin of the unique chaetognath body plan.

Nature. 2025 Aug 13. doi: 10.1038/s41586-025-09403-2.

De-novo assembly of four rail (Aves: Rallidae) genomes: A resource for comparative genomics.

Ecol Evol. 2024 Jul 18;14(7):e11694. doi: 10.1002/ece3.11694. eCollection 2024 Jul.

Post-meiotic mechanism of facultative parthenogenesis in gonochoristic whiptail lizard species.

Elife. 2024 Jun 7;13:e97035. doi: 10.7554/eLife.97035.

Acceleration of genome rearrangement in clitellate annelids.

bioRxiv. 2024 May 14:2024.05.12.593736. doi: 10.1101/2024.05.12.593736.

The hagfish genome and the evolution of vertebrates.

Nature. 2024 Mar;627(8005):811-820. doi: 10.1038/s41586-024-07070-3. Epub 2024 Jan 23.

Conserved chromatin and repetitive patterns reveal slow genome evolution in frogs.

Nat Commun. 2024 Jan 17;15(1):579. doi: 10.1038/s41467-023-43012-9.

Reference genome of the nutrition-rich orphan crop chia () and its implications for future breeding.

Front Plant Sci. 2023 Dec 14;14:1272966. doi: 10.3389/fpls.2023.1272966. eCollection 2023.

A chromosome-level reference genome for the common octopus, Octopus vulgaris (Cuvier, 1797).

G3 (Bethesda). 2023 Dec 6;13(12). doi: 10.1093/g3journal/jkad220.

Chromosome-level genome assemblies of two parasitoid biocontrol wasps reveal the parthenogenesis mechanism and an associated novel virus.

BMC Genomics. 2023 Aug 5;24(1):440. doi: 10.1186/s12864-023-09538-4.

Chromosomal-level reference genome of a wild North American mallard (Anas platyrhynchos).

G3 (Bethesda). 2023 Sep 30;13(10). doi: 10.1093/g3journal/jkad171.
