Zhou Ting, Caflisch Amedeo
Department of Biochemistry, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland.
J Chem Inf Model. 2009 Jan;49(1):145-52. doi: 10.1021/ci800295q.
High-throughput docking (HTD) using high-performance computing platforms is a multidisciplinary challenge. To handle HTD data effectively and efficiently, we have developed a distributed virtual screening data management system (DVSDMS) in which data handling and job distribution are performed by the open-source structured query language (SQL) database software MySQL. The essential concept of DVSDMS is the separation of data management from the docking and ranking applications. DVSDMS can be used to dock millions of molecules effectively, monitor the process in real time, analyze docking results promptly, and process up to 10⁸ poses by energy ranking techniques. In an HTD campaign to identify kinase inhibitors, DVSDMS running on a low-cost Linux PC efficiently assigned the workload to more than 500 computing clients. Notably, in a stress test of DVSDMS that emulated a large number of clients, about 60 molecules per second were distributed to the clients for docking, which indicates that DVSDMS can run efficiently on very large compute clusters (up to about 40000 cores).
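The idea of letting a SQL database both store the data and hand out docking jobs can be illustrated with a minimal sketch. It is not the DVSDMS implementation: the table layout, column names, and function names below are hypothetical, and Python's built-in sqlite3 module stands in for MySQL so that the example runs without a server. On MySQL, a comparable atomic claim would probably rely on an InnoDB transaction with a locking read such as SELECT ... FOR UPDATE, which is an assumption here rather than something the abstract states.

```python
import sqlite3


def make_db(n_molecules):
    """Create an in-memory job table (hypothetical schema, SQLite standing in for MySQL)."""
    # isolation_level=None puts the connection in autocommit mode, so the
    # transactions below are controlled explicitly with BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE molecules ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " client TEXT,"
        " energy REAL)"
    )
    conn.executemany(
        "INSERT INTO molecules (name) VALUES (?)",
        [(f"mol_{i}",) for i in range(n_molecules)],
    )
    return conn


def claim_batch(conn, client, batch_size):
    """Atomically hand the next pending molecules to one docking client."""
    # BEGIN IMMEDIATE takes the write lock up front, so two clients
    # cannot select and claim the same rows.
    conn.execute("BEGIN IMMEDIATE")
    try:
        rows = conn.execute(
            "SELECT id, name FROM molecules"
            " WHERE status = 'pending' ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        conn.executemany(
            "UPDATE molecules SET status = 'running', client = ? WHERE id = ?",
            [(client, mol_id) for mol_id, _ in rows],
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return rows


def report_result(conn, mol_id, energy):
    """Store the docking energy returned by a client and mark the job done."""
    conn.execute(
        "UPDATE molecules SET status = 'done', energy = ? WHERE id = ?",
        (energy, mol_id),
    )


def progress(conn):
    """Real-time monitoring: number of molecules in each state."""
    return dict(
        conn.execute(
            "SELECT status, COUNT(*) FROM molecules GROUP BY status"
        ).fetchall()
    )


def top_ranked(conn, n):
    """Energy ranking: the n finished molecules with the lowest energy."""
    return conn.execute(
        "SELECT name, energy FROM molecules"
        " WHERE status = 'done' ORDER BY energy ASC LIMIT ?",
        (n,),
    ).fetchall()
```

In this sketch the docking program only needs to call claim_batch and report_result, while monitoring and ranking are plain queries against the same table, which mirrors the separation of data management from the docking and ranking applications described above. As a rough consistency check that the abstract does not state explicitly, distributing 60 molecules per second to about 40000 cores corresponds to an average docking time of roughly 40000 / 60 ≈ 670 seconds per molecule per core.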