Meraculous: de novo genome assembly with short paired-end reads.

Affiliation

U.S. Department of Energy Joint Genome Institute, Walnut Creek, California, United States of America.

Publication information

PLoS One. 2011;6(8):e23501. doi: 10.1371/journal.pone.0023501. Epub 2011 Aug 18.

Abstract

We describe a new algorithm, meraculous, for whole genome assembly of deep paired-end short reads, and apply it to the assembly of a dataset of paired 75-bp Illumina reads derived from the 15.4 megabase genome of the haploid yeast Pichia stipitis. More than 95% of the genome is recovered, with no errors; half the assembled sequence is in contigs longer than 101 kilobases and in scaffolds longer than 269 kilobases. Incorporating fosmid ends recovers entire chromosomes. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (de Bruijn) graph of oligonucleotides with unique high quality extensions in the dataset, avoiding an explicit error correction step as used in other short-read assemblers. A novel memory-efficient hashing scheme is introduced. The resulting contigs are ordered and oriented using paired reads separated by ∼280 bp or ∼3.2 kbp, and many gaps between contigs can be closed using paired-end placements. Practical issues with the dataset are described, and prospects for assembling larger genomes are discussed.
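The idea of traversing only k-mers with unique high-quality extensions can be illustrated with a small Python sketch. This is not the Meraculous implementation: it ignores reverse complements, the memory-efficient hashing scheme, scaffolding, and gap closing. The function names (collect_extensions, unique_base, build_contigs) and the thresholds min_qual and min_count are assumptions chosen for illustration only.

```python
from collections import Counter, defaultdict


def collect_extensions(reads, k, min_qual=20):
    """Count each k-mer and the high-quality bases seen directly left/right of it.

    reads is a list of (sequence, qualities) pairs; qualities are Phred-like ints.
    """
    counts = Counter()
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    for seq, quals in reads:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            counts[kmer] += 1
            # Only bases at or above the quality cutoff count as extensions.
            if i > 0 and quals[i - 1] >= min_qual:
                left[kmer][seq[i - 1]] += 1
            if i + k < len(seq) and quals[i + k] >= min_qual:
                right[kmer][seq[i + k]] += 1
    return counts, left, right


def unique_base(ext_counts, min_count):
    """Return the only base seen at least min_count times, or None."""
    supported = [b for b, n in ext_counts.items() if n >= min_count]
    return supported[0] if len(supported) == 1 else None


def build_contigs(reads, k, min_qual=20, min_count=2):
    """Walk chains of k-mers whose left and right extensions are both unique."""
    counts, left, right = collect_extensions(reads, k, min_qual)
    # Assumed depth filter: drop k-mers seen fewer than min_count times.
    kmers = {km for km, n in counts.items() if n >= min_count}
    nxt, prv = {}, {}
    for km in kmers:
        r = unique_base(right.get(km, {}), min_count)
        l = unique_base(left.get(km, {}), min_count)
        if r is not None:
            nxt[km] = km[1:] + r
        if l is not None:
            prv[km] = l + km[:-1]

    def step(km):
        # Conservative rule: move a -> b only if b also points uniquely back to a.
        b = nxt.get(km)
        return b if b is not None and prv.get(b) == km else None

    # Contigs start at k-mers that no other k-mer steps into.
    targets = {step(km) for km in kmers} - {None}
    contigs = []
    for start in sorted(kmers - targets):
        contig, km = start, step(start)
        while km is not None:
            contig += km[-1]
            km = step(km)
        contigs.append(contig)
    return contigs


if __name__ == "__main__":
    genome = "ACGTTGCATGGACCTA"          # toy sequence, not real data
    good = (genome, [30] * len(genome))
    bad_seq = "ACGTTGCAAGG"              # one substitution at index 8
    bad_qual = [30] * len(bad_seq)
    bad_qual[8] = 5                      # the erroneous base has low quality
    print(build_contigs([good, good, (bad_seq, bad_qual)], k=5))
```

In this toy input the low-quality erroneous base never becomes a candidate extension, and the k-mers that contain it fall below the depth cutoff, so the walk reconstructs the toy sequence without a separate error correction pass. A fork (two well-supported extensions) simply ends the contig, which is what makes the traversal conservative.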

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/05e0/3158087/cda627c9f8ab/pone.0023501.g001.jpg
