Tentacle: distributed quantification of genes in metagenomes.
Author information
Boulund Fredrik, Sjögren Anders, Kristiansson Erik
Affiliations
Division of Statistics, Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden.
Publication information
Gigascience. 2015 Sep 7;4:40. doi: 10.1186/s13742-015-0078-1. eCollection 2015.
BACKGROUND
In metagenomics, microbial communities are sequenced at increasingly high resolution, generating datasets with billions of DNA fragments. Novel methods that can efficiently process the growing volumes of sequence data are necessary for the accurate analysis and interpretation of existing and upcoming metagenomes.
FINDINGS
Here we present Tentacle, a novel framework that uses distributed computational resources for gene quantification in metagenomes. Tentacle is implemented using a dynamic master-worker approach in which DNA fragments are streamed over a network and processed in parallel on worker nodes. Tentacle is modular and extensible, and it comes with support for six commonly used sequence aligners. It is easy to adapt to different applications in metagenomics and to integrate into existing workflows.
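The dynamic master-worker pattern described above can be illustrated with a minimal sketch. This is not Tentacle's code or API: threads stand in for networked worker nodes, a substring match stands in for a real sequence aligner, and every name (`count_gene_hits`, `worker`, `quantify`, `gene_markers`) is hypothetical. The point is only that each worker pulls the next chunk of reads as soon as it is free, and the master merges the per-chunk gene counts.

```python
import queue
import threading
from collections import Counter


def count_gene_hits(reads, gene_markers):
    """Toy stand-in for an aligner (assumption): a read counts as a hit
    for a gene if it contains that gene's marker substring."""
    hits = Counter()
    for read in reads:
        for gene, marker in gene_markers.items():
            if marker in read:
                hits[gene] += 1
    return hits


def worker(tasks, results, gene_markers):
    """Pull chunks until the sentinel arrives; report per-chunk counts."""
    while True:
        chunk = tasks.get()
        if chunk is None:  # sentinel: no more work for this worker
            break
        results.put(count_gene_hits(chunk, gene_markers))


def quantify(chunks, gene_markers, n_workers=3):
    """Master: hand out chunks dynamically and merge the gene counts."""
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results, gene_markers))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()

    n_chunks = 0
    for chunk in chunks:  # idle workers take chunks as they are queued
        tasks.put(chunk)
        n_chunks += 1
    for _ in threads:  # one sentinel per worker
        tasks.put(None)

    total = Counter()
    for _ in range(n_chunks):  # exactly one result per chunk
        total.update(results.get())
    for t in threads:
        t.join()
    return dict(total)
```

Because workers take a new chunk only when they finish the previous one, slow chunks do not hold up the rest of the queue, which is the property that makes the scheme "dynamic". For example, `quantify([["ACGTAA", "TTGACA"], ["ACGTCC"]], {"geneA": "ACGT", "geneB": "GACA"})` returns `{"geneA": 2, "geneB": 1}`.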
CONCLUSIONS
Evaluations show that Tentacle scales very well with increasing computing resources. We illustrate the versatility of Tentacle on three different use cases. Tentacle is written for Linux in Python 2.7 and is published as open source under the GNU General Public License (v3). Documentation, tutorials, installation instructions, and the source code are freely available online at: http://bioinformatics.math.chalmers.se/tentacle.