cl-dash: rapid configuration and deployment of Hadoop clusters for bioinformatics research in the cloud.

Author information

Hodor Paul, Chawla Amandeep, Clark Andrew, Neal Lauren

Affiliation

Booz Allen Hamilton, Rockville, MD 20852, USA.

Publication information

Bioinformatics. 2016 Jan 15;32(2):301-3. doi: 10.1093/bioinformatics/btv553. Epub 2015 Oct 1.

Abstract

One of the solutions proposed for addressing the challenge of the overwhelming abundance of genomic sequence and other biological data is the use of the Hadoop computing framework. Appropriate tools are needed to set up computational environments that facilitate research of novel bioinformatics methodology using Hadoop. Here, we present cl-dash, a complete starter kit for setting up such an environment. Configuring and deploying new Hadoop clusters can be done in minutes. Use of Amazon Web Services ensures no initial investment and minimal operation costs. Two sample bioinformatics applications help the researcher understand and learn the principles of implementing an algorithm using the MapReduce programming pattern.
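As an illustration of the MapReduce programming pattern mentioned above, the following is a minimal sketch in plain Python, with no Hadoop dependency. The abstract does not say which algorithms the two sample applications implement, so k-mer counting is chosen here purely as an assumed example, and the function names (`map_kmers`, `shuffle`, `reduce_counts`, `count_kmers`) are hypothetical and not taken from cl-dash.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_kmers(sequence: str, k: int) -> Iterator[Tuple[str, int]]:
    """Map step: emit (k-mer, 1) for every window of length k in one read."""
    seq = sequence.upper()
    # A read shorter than k produces an empty range and emits nothing.
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group emitted values by key.

    On a real cluster the framework performs this grouping between the
    map and reduce phases; it is written out here only for illustration.
    """
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: sum the counts collected for one k-mer."""
    return key, sum(values)


def count_kmers(reads: Iterable[str], k: int) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a single machine."""
    emitted = (pair for read in reads for pair in map_kmers(read, k))
    grouped = shuffle(emitted)
    return dict(reduce_counts(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical toy reads, not data from the paper.
    reads = ["ACGTAC", "GTACGT"]
    for kmer, count in sorted(count_kmers(reads, 3).items()):
        print(kmer, count)
```

The point of the pattern is that the map and reduce functions are each independent per record and per key, which is what allows a framework such as Hadoop to distribute them across the nodes of a cluster.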

AVAILABILITY AND IMPLEMENTATION

Source code is available at https://bitbucket.org/booz-allen-sci-comp-team/cl-dash.git.

CONTACT

hodor_paul@bah.com.

Figure (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd67/4708102/b9fa903d2746/btv553f1p.jpg
