Message passing interface and multithreading hybrid for parallel molecular docking of large databases on petascale high performance computing machines.

Affiliation

Biosciences & Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Lab, Livermore, CA 94550, USA.

Publication information

J Comput Chem. 2013 Apr 30;34(11):915-27. doi: 10.1002/jcc.23214. Epub 2013 Jan 23.

DOI:10.1002/jcc.23214

PMID:23345155

Abstract

A mixed parallel scheme that combines message passing interface (MPI) and multithreading was implemented in the AutoDock Vina molecular docking program. The resulting program, named VinaLC, was tested on the petascale high performance computing (HPC) machines at Lawrence Livermore National Laboratory. To exploit typical cluster-type supercomputers, thousands of docking calculations were dispatched by the master process to run simultaneously on thousands of slave processes. Each docking calculation takes one slave process on one node, and within the node each docking calculation runs via multithreading on multiple CPU cores with shared memory. Input and output of the program and the data handling within the program were carefully designed to deal with large databases and ultimately achieve HPC on a large number of CPU cores. Parallel performance analysis of the VinaLC program shows that the code scales up to more than 15K CPUs with a very low overhead cost of 3.94%. One million flexible compound docking calculations took only 1.4 h to finish on about 15K CPUs. The docking accuracy of VinaLC has been validated against the DUD data set by the re-docking of X-ray ligands and by an enrichment study. 64.4% of the top-scoring poses have RMSD values under 2.0 Å. The program has been demonstrated to have good enrichment performance on 70% of the targets in the DUD data set. An analysis of the enrichment factors calculated at various percentages of the screening database indicates that VinaLC has very good early recovery of actives.
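The two-level scheme described in the abstract (a master handing out docking tasks to worker processes, each of which runs its task on several threads) can be illustrated with a minimal single-process sketch. This is not VinaLC code and it does not use MPI: plain Python threads stand in for the MPI worker processes, and `score_pose`, `dock_ligand`, `worker`, and `run_screen` are hypothetical names with a toy scoring function, chosen only to show the control flow.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def score_pose(ligand: str, seed: int) -> float:
    """Hypothetical stand-in for one docking search run; returns a toy score."""
    # Deterministic fake score so the sketch is reproducible; not a real energy.
    return -float((sum(map(ord, ligand)) * (seed + 1)) % 97) / 10.0


def dock_ligand(ligand: str, threads_per_task: int) -> float:
    """Inner level: one docking task fans out over threads on one 'node'."""
    with ThreadPoolExecutor(max_workers=threads_per_task) as pool:
        scores = pool.map(lambda seed: score_pose(ligand, seed),
                          range(threads_per_task))
        return min(scores)  # keep the best (lowest) toy score


def worker(tasks, results, lock, threads_per_task):
    """Outer level: a worker repeatedly takes the next ligand from the master's queue."""
    while True:
        ligand = tasks.get()
        if ligand is None:  # sentinel: the master has no more work
            return
        best = dock_ligand(ligand, threads_per_task)
        with lock:  # protect the shared result table
            results[ligand] = best


def run_screen(ligands, n_workers=4, threads_per_task=3):
    """Master: start workers, enqueue every ligand, then one stop sentinel per worker."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker,
                         args=(tasks, results, lock, threads_per_task))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for ligand in ligands:
        tasks.put(ligand)
    for _ in workers:
        tasks.put(None)
    for w in workers:
        w.join()
    return results
```

Calling `run_screen(["lig0", "lig1", "lig2"])` returns a dictionary mapping each ligand name to its best toy score. Because workers pull tasks from a shared queue rather than receiving a fixed share up front, a worker that finishes early simply takes the next ligand. This is one common way to balance tasks of uneven duration; whether VinaLC distributes work this way is not stated in the abstract.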

