Karpievitch Yuliya V, Almeida Jonas S
Department of Biostatistics, Bioinformatics and Epidemiology, Medical University of South Carolina, Charleston, SC 29425, USA.
BMC Bioinformatics. 2006 Mar 15;7:139. doi: 10.1186/1471-2105-7-139.
Matlab, a powerful and productive language that allows for rapid prototyping, modeling and simulation, is widely used in computational biology. Modeling and simulation of large biological systems often require more computational resources than are available on a single computer. Existing distributed computing environments such as the Distributed Computing Toolbox, MatlabMPI, Matlab*G and others allow for the remote (and possibly parallel) execution of Matlab commands, with varying support for features such as an easy-to-use application programming interface, load-balanced utilization of resources, extensibility over the wide area network, and minimal system administration skill requirements. However, all of these environments require some level of access to participating machines in order to manually distribute the user-defined libraries that the remote call may invoke.
mGrid augments the usual process distribution seen in other similar distributed systems by adding facilities for user code distribution. mGrid's client-side interface is an easy-to-use native Matlab toolbox that transparently executes user-defined code on remote machines (i.e. the user is unaware that the code is executing somewhere else). Run-time variables are automatically packed and distributed with the user-defined code and automated load-balancing of remote resources enables smooth concurrent execution. mGrid is an open source environment. Apart from the programming language itself, all other components are also open source, freely available tools: light-weight PHP scripts and the Apache web server.
Transparent, load-balanced distribution of user-defined Matlab toolboxes and rapid prototyping of many simple parallel applications can now be done with a single easy-to-use Matlab command. Because mGrid utilizes only Matlab, light-weight PHP scripts and the Apache web server, installation and configuration are very simple. Moreover, the web-based infrastructure of mGrid makes it easy to extend over the Internet.
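The two ideas summarized above, bundling user code with its run-time variables and balancing jobs across remote resources, can be illustrated with a minimal sketch. This is not mGrid's implementation or API (mGrid itself is built from Matlab, PHP scripts and Apache); the function names `pack_job` and `assign_jobs`, the use of Python, and the greedy least-loaded policy are all assumptions chosen for illustration, since the abstract does not state which balancing strategy is used.

```python
import heapq
import pickle


def pack_job(code_text, variables):
    """Bundle user code and its run-time variables into one byte payload.

    Hypothetical helper: illustrates shipping code and data together,
    not the format mGrid actually uses.
    """
    return pickle.dumps({"code": code_text, "vars": variables})


def assign_jobs(job_costs, workers):
    """Greedy least-loaded assignment (an assumed policy).

    Each job, in order, goes to the worker whose accumulated
    estimated cost is currently smallest.
    """
    # Heap of (current_load, worker_name); ties break on worker name.
    heap = [(0.0, name) for name in workers]
    heapq.heapify(heap)
    assignment = {name: [] for name in workers}
    for job_id, cost in enumerate(job_costs):
        load, name = heapq.heappop(heap)
        assignment[name].append(job_id)
        heapq.heappush(heap, (load + cost, name))
    return assignment


if __name__ == "__main__":
    payload = pack_job("y = x * 2;", {"x": 3})
    print(len(payload) > 0)
    print(assign_jobs([3, 1, 1, 1], ["a", "b"]))
```

In this sketch one expensive job occupies worker `a` while the three cheap jobs are routed to worker `b`, which is the kind of behavior a load-balanced dispatcher aims for; a real system would measure worker load rather than rely on cost estimates supplied up front.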