Using Grid technology for computationally intensive applied bioinformatics analyses.

Author information

Andrade Jorge, Berglund Lisa, Uhlén Mathias, Odeberg Jacob

Affiliation

Department of Biotechnology, Royal Institute of Technology (KTH), Stockholm, Sweden.

Publication information

In Silico Biol. 2006;6(6):495-504.

Abstract

For several applications and algorithms used in applied bioinformatics, computational time can become a bottleneck when they are scaled up to analyse large datasets and databases. Re-coding, algorithm modification, or sacrifices in sensitivity and accuracy may be necessary to accommodate the limited computational capacity of a single workstation. Grid computing offers an alternative model for solving massive computational problems by parallel execution of existing algorithms and software implementations. We present the implementation of a Grid-aware model for solving computationally intensive bioinformatics analyses, exemplified by a blastp sliding window algorithm for whole-proteome sequence similarity analysis, and evaluate its performance in comparison with a local cluster and a single workstation. Our strategy involves temporary installation of the BLAST executable and databases on remote nodes at submission. This accommodates dynamic Grid environments because it avoids the need for predefined runtime environments (preinstalled software and databases at specific Grid nodes). Importantly, the implementation is generic: the BLAST executable can be replaced by other software tools whose analyses are suitable for parallelisation. This model should be of general interest in applied bioinformatics. Scripts and procedures are freely available from the authors.
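
As an illustration of the sliding window idea mentioned in the abstract, the following minimal Python sketch cuts a protein sequence into overlapping fragments and distributes them into independent batches, one per job. The window length, step size, function names, and round-robin batching are assumptions made for illustration; the abstract does not state the authors' parameters or scripts, and the BLAST invocation and Grid submission are not shown.

```python
from typing import Iterator, List, Tuple


def sliding_windows(seq: str, window: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start_offset, fragment) pairs covering seq with overlapping windows.

    `window` and `step` are hypothetical parameters, not values from the paper.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not seq:
        return
    if len(seq) <= window:
        # Sequence shorter than one window: emit it whole.
        yield 0, seq
        return
    last_start = len(seq) - window
    for start in range(0, last_start + 1, step):
        yield start, seq[start:start + window]
    # If the step did not land exactly on the tail, add one final window
    # flush with the end so that no residues are left uncovered.
    if last_start % step != 0:
        yield last_start, seq[last_start:]


def to_fasta(protein_id: str, seq: str, window: int, step: int) -> List[str]:
    """Format each fragment as a FASTA record named <protein_id>_<offset>."""
    return [
        f">{protein_id}_{start}\n{fragment}"
        for start, fragment in sliding_windows(seq, window, step)
    ]


def split_into_jobs(records: List[str], n_jobs: int) -> List[List[str]]:
    """Distribute records round-robin into at most n_jobs non-empty batches."""
    if n_jobs <= 0:
        raise ValueError("n_jobs must be positive")
    batches = [records[i::n_jobs] for i in range(n_jobs)]
    return [batch for batch in batches if batch]


if __name__ == "__main__":
    # Toy sequence; a real run would iterate over a whole proteome.
    records = to_fasta("P1", "ABCDEFGHIJ", window=4, step=3)
    for job_index, batch in enumerate(split_into_jobs(records, n_jobs=2)):
        print(f"job {job_index}: {len(batch)} fragment(s)")
```

Because every batch is independent, each one could be submitted as a separate job together with the executable and database it needs, which is the property the abstract relies on for parallel execution.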

Similar articles

The use of grid computing to drive data-intensive genetic research.
Eur J Hum Genet. 2007 Jun;15(6):694-702. doi: 10.1038/sj.ejhg.5201815. Epub 2007 Mar 21.

Squid - a simple bioinformatics grid.
BMC Bioinformatics. 2005 Aug 3;6:197. doi: 10.1186/1471-2105-6-197.

Cited by

The epitope space of the human proteome.
Protein Sci. 2008 Apr;17(4):606-13. doi: 10.1110/ps.073347208.
