Task-parallel message passing interface implementation of Autodock4 for docking of very large databases of compounds using high-performance super-computers.

Affiliation

Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee 37996, USA.

Publication information

J Comput Chem. 2011 Apr 30;32(6):1202-9. doi: 10.1002/jcc.21696. Epub 2010 Dec 7.

DOI:10.1002/jcc.21696

PMID:21387347

Abstract

A message passing interface (MPI)-based implementation (Autodock4.lga.MPI) of the grid-based docking program Autodock4 has been developed to allow simultaneous and independent docking of multiple compounds on up to thousands of central processing units (CPUs) using the Lamarckian genetic algorithm. The MPI version reads a single binary file containing precalculated grids that represent the protein-ligand interactions, i.e., van der Waals, electrostatic, and desolvation potentials, and needs only two input parameter files for the entire docking run. In comparison, the serial version of Autodock4 reads ASCII grid files and requires one parameter file per compound. These modifications significantly reduce input/output activity compared with the serial version. Autodock4.lga.MPI scales up to 8192 CPUs with a maximal overhead of 16.3%, of which two thirds is due to input/output operations and one third originates from MPI operations. The optimal docking strategy, which minimizes docking CPU time without lowering the quality of the database enrichments, comprises the docking of ligands preordered from the most to the least flexible and the assignment of the number of energy evaluations as a function of the number of rotatable bonds. In 24 h, on 8192 high-performance computing CPUs, the present MPI version would allow docking to a rigid protein of about 300K small flexible compounds or 11 million rigid compounds.
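
The docking strategy described in the abstract (ligands preordered from most to least flexible, with an energy-evaluation budget that grows with the number of rotatable bonds) can be illustrated with a small scheduling sketch. This is not the Autodock4.lga.MPI code: it runs serially, uses no MPI, and every name in it (`Ligand`, `energy_evaluations`, `schedule`) is hypothetical. The linear budget and its constants are placeholder assumptions, since the abstract does not give the actual function, and the number of energy evaluations is used here as a stand-in for docking cost.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Ligand:
    name: str
    rotatable_bonds: int


def energy_evaluations(rotatable_bonds, base=250_000, per_bond=250_000, cap=5_000_000):
    """Hypothetical budget: grows linearly with flexibility, up to a cap.

    The constants are illustrative assumptions, not values from the paper.
    """
    return min(base + per_bond * rotatable_bonds, cap)


def schedule(ligands, n_workers):
    """Hand out ligands, most flexible first, to whichever worker frees up earliest.

    Returns (assignments, makespan), where makespan is the largest total
    budget given to any single worker.
    """
    # Preorder from the most to the least flexible ligand.
    ordered = sorted(ligands, key=lambda lig: lig.rotatable_bonds, reverse=True)

    # Min-heap of (accumulated cost, worker id): the top is the least-loaded worker.
    heap = [(0, worker) for worker in range(n_workers)]
    heapq.heapify(heap)
    assignments = {worker: [] for worker in range(n_workers)}

    for lig in ordered:
        load, worker = heapq.heappop(heap)
        assignments[worker].append(lig.name)
        heapq.heappush(heap, (load + energy_evaluations(lig.rotatable_bonds), worker))

    makespan = max(load for load, _ in heap)
    return assignments, makespan


if __name__ == "__main__":
    library = [Ligand("L0", 0), Ligand("L10", 10), Ligand("L2", 2), Ligand("L5", 5)]
    plan, makespan = schedule(library, n_workers=2)
    print(plan, makespan)
```

Starting with the most expensive ligands keeps the cheap ones available to fill gaps at the end of the run, so no worker is left finishing one long docking while the others sit idle. As a rough check on the throughput quoted above, 8192 CPUs for 24 h is 196,608 CPU-hours, which works out to about 39 CPU-minutes per small flexible compound or about 64 CPU-seconds per rigid compound, ignoring overhead.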
