Gregor I, Schönhuth A, McHardy A C
Department of Algorithmic Bioinformatics, Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany; Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig 38124, Germany.
Centrum Wiskunde & Informatica, Amsterdam 1098 XG, The Netherlands.
Bioinformatics. 2016 Sep 1;32(17):i649-i657. doi: 10.1093/bioinformatics/btw426.
Gene assembly is an important step in the functional analysis of shotgun metagenomic data. Nonetheless, strain-aware assembly remains challenging, because current assembly tools often fail to distinguish among strain variants or require closely related reference genomes of the studied species.
We have developed Snowball, a novel strain-aware gene assembler for shotgun metagenomic data that does not require closely related reference genomes. It uses profile hidden Markov models (HMMs) of gene domains of interest to guide the assembly. The assembler processes each gene domain individually, joining reads by their overlaps while simultaneously correcting errors using read quality scores, which results in very low per-base error rates.
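To make the idea of joining reads by overlap while correcting errors with quality scores more concrete, the following is a minimal illustrative sketch. It is not Snowball's implementation: the function names, the mismatch threshold, the quality cap of 60, and the rule for combining Phred scores are all assumptions chosen for illustration, and the HMM-guided placement of reads is omitted entirely.

```python
from typing import List, Tuple

def best_overlap(a: str, b: str, min_len: int = 3,
                 max_mismatch_rate: float = 0.2) -> int:
    """Length of the longest suffix of `a` that matches a prefix of `b`
    within the allowed mismatch rate (0 if none reaches `min_len`)."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        mismatches = sum(x != y for x, y in zip(a[-length:], b[:length]))
        if mismatches <= max_mismatch_rate * length:
            return length
    return 0

def merge_reads(a: str, qa: List[int], b: str, qb: List[int],
                overlap: int) -> Tuple[str, List[int]]:
    """Merge two reads over `overlap` positions (hypothetical scoring rule).

    Agreeing bases reinforce each other (qualities are summed, capped at 60).
    Disagreeing bases are resolved in favour of the higher-quality call,
    and the resulting quality is the difference of the two scores.
    """
    start = len(a) - overlap
    seq = list(a[:start])
    qual = list(qa[:start])
    for i in range(overlap):
        x, qx = a[start + i], qa[start + i]
        y, qy = b[i], qb[i]
        if x == y:
            seq.append(x)
            qual.append(min(qx + qy, 60))
        elif qx >= qy:
            seq.append(x)
            qual.append(qx - qy)
        else:
            seq.append(y)
            qual.append(qy - qx)
    seq.extend(b[overlap:])
    qual.extend(qb[overlap:])
    return "".join(seq), qual

# Toy example: read_a carries a low-quality 'G' that read_b contradicts
# with a high-quality 'C'.
read_a, qual_a = "ACGTACGT", [30, 30, 30, 30, 30, 30, 5, 30]
read_b, qual_b = "TACCTGG", [30, 30, 30, 35, 30, 30, 30]
n = best_overlap(read_a, read_b)
merged, merged_qual = merge_reads(read_a, qual_a, read_b, qual_b, n)
print(merged)  # ACGTACCTGG
```

In this toy case the low-quality base is replaced by the better-supported call from the overlapping read, which is the general intuition behind performing error correction during assembly rather than as a separate step.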
The software runs in parallel on a user-defined number of processor cores, works on a standard laptop, and is available under the GPL 3.0 license for installation under Linux or OS X at https://github.com/hzi-bifo/snowball
AMC14@helmholtz-hzi.de, a.schoenhuth@cwi.nl
Supplementary data are available at Bioinformatics online.