Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures.

Author information

Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.

Publication information

PLoS One. 2013 Sep 27;8(9):e75505. doi: 10.1371/journal.pone.0075505. eCollection 2013.

Abstract

A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e477/3785575/ee9b313c53b0/pone.0075505.g001.jpg
