MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads.

Affiliation

Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan.

Publication information

Nucleic Acids Res. 2012 Nov 1;40(20):e155. doi: 10.1093/nar/gks678. Epub 2012 Jul 19.

Abstract

An important step in 'metagenomics' analysis is the assembly of multiple genomes from mixed sequence reads of multiple species in a microbial community. Most conventional pipelines use a single-genome assembler with carefully optimized parameters. A limitation of a single-genome assembler for de novo metagenome assembly is that sequences of highly abundant species are likely to be misidentified as repeats in a single genome, resulting in a number of small fragmented scaffolds. We extended 'Velvet', a single-genome assembler for short reads, to metagenome assembly from mixed short reads of multiple species, and called the extension 'MetaVelvet'. Our fundamental concept was, first, to decompose a de Bruijn graph constructed from mixed short reads into individual sub-graphs and, second, to build scaffolds from each decomposed de Bruijn sub-graph as if it were an isolate species genome. We made use of two features, the coverage (abundance) difference and graph connectivity, for the decomposition of the de Bruijn graph. For simulated datasets, MetaVelvet succeeded in generating significantly higher N50 scores than any of the single-genome assemblers. MetaVelvet also reconstructed relatively low-coverage genome sequences as scaffolds. On real datasets of human gut microbial read data, MetaVelvet produced longer scaffolds and increased the number of predicted genes.
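The two ideas named in the abstract, decomposition by coverage difference plus graph connectivity, and the N50 score used for evaluation, can be illustrated with a toy sketch. This is not MetaVelvet's actual algorithm or code: the function names (`kmer_coverage`, `split_by_coverage`, `n50`), the single fixed coverage `threshold`, and the simplification of treating each k-mer as a node linked to k-mers that overlap it by k-1 bases are all assumptions made for illustration. Reverse complements, sequencing errors, and scaffolding are ignored.

```python
from collections import Counter, defaultdict


def n50(lengths):
    """Largest length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def kmer_coverage(reads, k):
    """Count how many times each k-mer occurs across all reads (a crude abundance estimate)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def split_by_coverage(counts, threshold):
    """Toy decomposition: connected groups of k-mers that share a coverage bin.

    Assumption: one hypothetical threshold separates 'high' from 'low' abundance.
    Links between k-mers in different bins are cut, then connected components
    are collected within what remains.
    """
    def bin_of(kmer):
        return "high" if counts[kmer] >= threshold else "low"

    # Index k-mers by their first k-1 bases so overlaps can be looked up quickly.
    by_prefix = defaultdict(list)
    for kmer in counts:
        by_prefix[kmer[:-1]].append(kmer)

    # Undirected adjacency restricted to k-mers in the same coverage bin.
    neighbours = defaultdict(set)
    for kmer in counts:
        for nxt in by_prefix.get(kmer[1:], []):
            if nxt != kmer and bin_of(nxt) == bin_of(kmer):
                neighbours[kmer].add(nxt)
                neighbours[nxt].add(kmer)

    # Depth-first search to collect connected components.
    seen = set()
    components = []
    for start in counts:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nb in neighbours[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        components.append((bin_of(start), component))
    return components
```

As a usage example, calling `split_by_coverage(kmer_coverage(["AAACCC"] * 5 + ["CCGTT"], 3), 3)` on five copies of one read and a single copy of another keeps the abundant k-mers and the rare k-mers in separate sub-graphs, even though `CCC` and `CCG` overlap and would be joined by connectivity alone. A real assembler would estimate coverage peaks from the data rather than take a fixed threshold.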

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/34b8/3488206/34a64a118319/gks678f1.jpg
