MEGAHIT v1.0: A fast and scalable metagenome assembler driven by advanced methodologies and community practices.

Author information

Li Dinghua, Luo Ruibang, Liu Chi-Man, Leung Chi-Ming, Ting Hing-Fung, Sadakane Kunihiko, Yamashita Hiroshi, Lam Tak-Wah

Affiliations

Department of Computer Science, University of Hong Kong, Hong Kong.

Department of Computer Science, University of Hong Kong, Hong Kong; L3 Bioinformatics Limited, Hong Kong.

Publication information

Methods. 2016 Jun 1;102:3-11. doi: 10.1016/j.ymeth.2016.02.020. Epub 2016 Mar 21.

DOI:10.1016/j.ymeth.2016.02.020
PMID:27012178
Abstract

The study of metagenomics has benefited greatly from low-cost, high-throughput sequencing technologies, yet the tremendous amount of data generated makes analyses such as de novo assembly consume too much computational resource. In late 2014 we released MEGAHIT v0.1 (together with a brief note, Li et al. (2015) [1]), the first NGS metagenome assembler able to assemble genome sequences from metagenomic datasets of hundreds of giga base pairs (Gbp) in a time- and memory-efficient manner on a single server. The core of MEGAHIT is an efficient parallel algorithm for constructing succinct de Bruijn graphs (SdBG), implemented on a graphics processing unit (GPU). The software has been well received by the assembly community, and there is interest in how to adapt the algorithms to integrate popular assembly practices so as to improve assembly quality, as well as in how to speed up the software using better CPU-based algorithms (instead of the GPU). In this paper we first describe the details of the core algorithms in MEGAHIT v0.1, and then present the new modules that upgrade MEGAHIT to version v1.0, which gives better assembly quality, runs faster, and uses less memory. For the Iowa Prairie Soil dataset (252 Gbp after quality trimming), the assembly quality of MEGAHIT v1.0 improves significantly over v0.1: a 36% increase in assembly size and a 23% increase in N50. More interestingly, MEGAHIT v1.0 is no slower than before, even when running with the extra modules. This is primarily due to a new CPU-based algorithm for SdBG construction that is faster and requires less memory. Using CPU only, MEGAHIT v1.0 can assemble the Iowa Prairie Soil sample in about 43 h, reducing the running time of v0.1 by at least 25% and memory usage by up to 50%. With its smaller memory footprint, MEGAHIT v1.0 can process even larger datasets. The Kansas Prairie Soil sample (484 Gbp), the largest publicly available dataset, can now be assembled using no more than 500 GB of memory in 7.5 days. The assemblies of these datasets (and of other large metagenomic datasets), as well as the software, are available at https://hku-bal.github.io/megabox.
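
The abstract reports assembly quality as assembly size and N50. As a reminder of what those two figures measure, the following is a minimal sketch, assuming the contig lengths are already available as a list of integers. The function name `n50` and the toy lengths are illustrative only; they are not taken from MEGAHIT or from the paper's data.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with made-up lengths (hypothetical, not from the paper).
lengths = [100, 80, 60, 40, 20]
assembly_size = sum(lengths)  # total bases across all contigs
print(assembly_size, n50(lengths))
```

Assembly size is simply the sum of contig lengths, so both metrics can be computed from the same list. A higher N50 at the same assembly size indicates that the bases sit in longer contigs.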
