Levesque Marshall J, Ichikawa Kohei, Date Susumu, Haga Jason H
Department of Bioengineering, University of California, San Diego, 9500 Gilman Dr, La Jolla, CA 92093-0435, USA.
Comput Methods Programs Biomed. 2009 Jan;93(1):73-82. doi: 10.1016/j.cmpb.2008.07.005. Epub 2008 Sep 3.
Grid computing offers a powerful alternative: sharing resources on a worldwide scale, across different institutions, to run computationally intensive scientific applications without the need for a centralized supercomputer. Much effort has been put into developing software that deploys legacy applications on a grid-based infrastructure and uses available resources efficiently. One field that can benefit greatly from grid resources is drug discovery, since molecular docking simulations are an integral part of the discovery process. In this paper, we present a scalable, reusable platform to choreograph large virtual screening experiments over a computational grid using the molecular docking simulation software DOCK. Software components are applied at multiple levels to create automated workflows consisting of input data delivery, job scheduling, status query, and collection of output, which is presented in a manageable form for further analysis. This was achieved using Opal OP to wrap the DOCK application as a grid service and Perl for data manipulation, reducing the need for extensive knowledge of grid infrastructure. With the platform in place, a screening of the 2,066,906-compound "drug-like" subset of the ZINC database against an enzyme's catalytic site was successfully performed using the MPI version of DOCK 5.4 on the PRAGMA grid testbed. The screening required 11.56 days of laboratory time and used 200 processors across 7 clusters.
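To give a sense of scale for the reported screening, the figures above can be turned into rough throughput numbers, and the input delivery step can be pictured as splitting the compound library across clusters. The following is a minimal sketch in Python, not the platform's actual Perl or Opal code. The cluster names, the per-cluster processor counts, and the proportional splitting rule are hypothetical; the abstract states only the totals (2,066,906 compounds, 11.56 days, 200 processors, 7 clusters). The per-compound estimate further assumes that all 200 processors were busy for the entire run, which the abstract does not claim.

```python
from dataclasses import dataclass

# Figures taken from the abstract.
TOTAL_COMPOUNDS = 2_066_906
WALL_DAYS = 11.56
TOTAL_PROCESSORS = 200


@dataclass
class Cluster:
    name: str        # hypothetical label, not a real PRAGMA site name
    processors: int  # hypothetical share of the 200 processors


def partition(total, clusters):
    """Split `total` compounds across clusters in proportion to processor count."""
    procs = sum(c.processors for c in clusters)
    shares = {c.name: total * c.processors // procs for c in clusters}
    # Integer division leaves a small remainder; hand it out one compound at a time.
    leftover = total - sum(shares.values())
    for c in clusters[:leftover]:
        shares[c.name] += 1
    return shares


def throughput(total, days, processors):
    """Return (compounds per day, processor-seconds per compound).

    The second value is an upper-bound style estimate that assumes every
    processor was fully occupied for the whole run.
    """
    per_day = total / days
    cpu_seconds_per_compound = processors * days * 86_400 / total
    return per_day, cpu_seconds_per_compound


# Hypothetical layout: 7 clusters whose processor counts sum to 200.
CLUSTERS = [
    Cluster(f"cluster{i}", n)
    for i, n in enumerate([40, 40, 32, 32, 24, 16, 16], start=1)
]

if __name__ == "__main__":
    for name, count in partition(TOTAL_COMPOUNDS, CLUSTERS).items():
        print(f"{name}: {count} compounds")
    per_day, cpu_s = throughput(TOTAL_COMPOUNDS, WALL_DAYS, TOTAL_PROCESSORS)
    print(f"{per_day:,.0f} compounds/day, about {cpu_s:.1f} processor-seconds each")
```

Under these assumptions the run works out to roughly 179,000 compounds per day. A real scheduler would probably also weigh processor speed and queue load rather than processor count alone.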