Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures.

Affiliation

Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.

Publication information

PLoS One. 2013 Sep 27;8(9):e75505. doi: 10.1371/journal.pone.0075505. eCollection 2013.

DOI:10.1371/journal.pone.0075505
PMID:24086547
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3785575/
Abstract

A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.
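The abstract compares assemblers by memory consumption and execution time. As a rough illustration of how those two quantities could be recorded for any command-line program, here is a minimal Python sketch. It is not the authors' benchmarking setup: the helper name `profile_command` and the stand-in command are assumptions made for illustration, and the `resource` module is available on Unix-like systems only.

```python
import resource
import subprocess
import sys
import time


def profile_command(argv):
    """Run argv as a child process and return (wall_seconds, peak_rss, exit_code).

    peak_rss comes from ru_maxrss: kibibytes on Linux, bytes on macOS.
    RUSAGE_CHILDREN reports the largest value over all children that have
    been waited for so far, so use a fresh Python process per measurement
    if several commands are to be compared.
    """
    start = time.perf_counter()
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    wall_seconds = time.perf_counter() - start
    peak_rss = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return wall_seconds, peak_rss, completed.returncode


if __name__ == "__main__":
    # Hypothetical stand-in for an assembler invocation: a small Python job.
    command = [sys.executable, "-c", "data = list(range(1_000_000))"]
    wall, peak, code = profile_command(command)
    print(f"exit={code} wall={wall:.2f}s peak_rss={peak}")
```

To profile a real assembler, replace `command` with that program's own argument list; the measurement logic does not depend on which tool is run.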


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e477/3785575/ee9b313c53b0/pone.0075505.g001.jpg

