Molecular simulation workflows as parallel algorithms: the execution engine of Copernicus, a distributed high-performance computing platform.

Authors

Sander Pronk, Iman Pouya, Magnus Lundborg, Grant Rotskoff, Björn Wesén, Peter M. Kasson, Erik Lindahl

Affiliations

Swedish eScience Research Center, Department of Theoretical Physics, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden.

Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden.

Publication information

J Chem Theory Comput. 2015 Jun 9;11(6):2600-8. doi: 10.1021/acs.jctc.5b00234.

DOI: 10.1021/acs.jctc.5b00234

PMID: 26575558

Abstract

Computational chemistry and other simulation fields are critically dependent on computing resources, but few problems scale efficiently to the hundreds of thousands of processors available in current supercomputers, particularly for molecular dynamics. This has turned into a bottleneck as new hardware generations primarily provide more processing units rather than making individual units much faster, which simulation applications are addressing by increasingly focusing on sampling with algorithms such as free-energy perturbation, Markov state modeling, metadynamics, or milestoning. All these rely on combining results from multiple simulations into a single observation. They are potentially powerful approaches that aim to predict experimental observables directly, but this comes at the expense of added complexity in selecting sampling strategies and keeping track of dozens to thousands of simulations and their dependencies. Here, we describe how the distributed execution framework Copernicus allows the expression of such algorithms in generic workflows: dataflow programs. Because dataflow algorithms explicitly state the dependencies of each constituent part, algorithms only need to be described on a conceptual level, after which the execution is maximally parallel. The fully automated execution facilitates the optimization of these algorithms with adaptive sampling, where undersampled regions are automatically detected and targeted without user intervention. We show how several such algorithms can be formulated for computational chemistry problems, and how they are executed efficiently with many loosely coupled simulations using either distributed or parallel resources with Copernicus.
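
The dataflow idea in the abstract can be illustrated with a minimal Python sketch. This is not the Copernicus API: `run_dataflow`, `simulate`, and `combine` are hypothetical names chosen for illustration, and the "simulation" is a toy random-number average. The sketch only shows the principle that, once each task declares its inputs, a scheduler can start every task whose inputs are available and so run independent simulations in parallel before combining them into a single observation.

```python
import random
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
from functools import partial


def run_dataflow(tasks, max_workers=4):
    """Run tasks given as name -> (function, [input task names]).

    Hypothetical helper, not part of Copernicus. A task is submitted as soon
    as all of its inputs have results, so independent tasks run concurrently.
    """
    results = {}
    pending = dict(tasks)
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # Find every task whose declared inputs are already computed.
            ready = [name for name, (_, deps) in pending.items()
                     if all(d in results for d in deps)]
            for name in ready:
                func, deps = pending.pop(name)
                future = pool.submit(func, *[results[d] for d in deps])
                running[future] = name
            if not running:
                # Nothing is running and nothing can start.
                raise ValueError("cyclic or unsatisfiable dependencies")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()
    return results


def simulate(seed):
    """Toy stand-in for one independent simulation: a noisy estimate of 1.0."""
    rng = random.Random(seed)
    return sum(rng.gauss(1.0, 0.1) for _ in range(1000)) / 1000


def combine(*estimates):
    """Combine the results of many simulations into a single observation."""
    return sum(estimates) / len(estimates)


# Eight independent "simulations" feeding one combining step.
tasks = {f"sim{i}": (partial(simulate, i), []) for i in range(8)}
tasks["average"] = (combine, [f"sim{i}" for i in range(8)])

results = run_dataflow(tasks)
print(results["average"])
```

Only the dependency lists are written by hand here; the execution order follows from them. An adaptive-sampling variant would, under the same assumptions, add new simulation tasks to the graph based on the combined result, which this sketch does not attempt.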

Similar articles

Molecular simulation workflows as parallel algorithms: the execution engine of Copernicus, a distributed high-performance computing platform.

J Chem Theory Comput. 2015 Jun 9;11(6):2600-8. doi: 10.1021/acs.jctc.5b00234.

Extending molecular simulation time scales: Parallel in time integrations for high-level quantum chemistry and complex force representations.

J Chem Phys. 2013 Aug 21;139(7):074114. doi: 10.1063/1.4818328.

Accelerating medical research using the swift workflow system.

Stud Health Technol Inform. 2007;126:207-16.

Atomistic protein folding simulations on the submillisecond time scale using worldwide distributed computing.

Biopolymers. 2003 Jan;68(1):91-109. doi: 10.1002/bip.10219.

Proceedings of the Second Workshop on Theory meets Industry (Erwin-Schrödinger-Institute (ESI), Vienna, Austria, 12-14 June 2007).

J Phys Condens Matter. 2008 Feb 13;20(6):060301. doi: 10.1088/0953-8984/20/06/060301. Epub 2008 Jan 24.

Optimization of tomographic reconstruction workflows on geographically distributed resources.

J Synchrotron Radiat. 2016 Jul;23(Pt 4):997-1005. doi: 10.1107/S1600577516007980. Epub 2016 Jun 15.

The cloud and other new computational methods to improve molecular modelling.

Expert Opin Drug Discov. 2014 Oct;9(10):1121-31. doi: 10.1517/17460441.2014.941800. Epub 2014 Aug 22.

Accelerating molecular modeling applications with graphics processors.

J Comput Chem. 2007 Dec;28(16):2618-40. doi: 10.1002/jcc.20829.

Grand challenges in biomedical computing.

Crit Rev Biomed Eng. 1992;20(1-2):1-24.

Development of hardware accelerator for molecular dynamics simulations: a computation board that calculates nonbonded interactions in cooperation with fast multipole method.

J Comput Chem. 2003 Apr 15;24(5):582-92. doi: 10.1002/jcc.10193.

Cited by

Differential toxicity and localization of arginine-rich dipeptide repeat proteins depend on de-clustering of positive charges.

iScience. 2023 May 25;26(6):106957. doi: 10.1016/j.isci.2023.106957. eCollection 2023 Jun 16.

Drug Design in the Exascale Era: A Perspective from Massively Parallel QM/MM Simulations.

J Chem Inf Model. 2023 Jun 26;63(12):3647-3658. doi: 10.1021/acs.jcim.3c00557. Epub 2023 Jun 15.

gmxapi: A GROMACS-native Python interface for molecular dynamics with ensemble and plugin support.

PLoS Comput Biol. 2022 Feb 14;18(2):e1009835. doi: 10.1371/journal.pcbi.1009835. eCollection 2022 Feb.

Phase separation and toxicity of C9orf72 poly(PR) depends on alternate distribution of arginine.

J Cell Biol. 2021 Nov 1;220(11). doi: 10.1083/jcb.202103160. Epub 2021 Sep 9.

Computational methods for exploring protein conformations.

Biochem Soc Trans. 2020 Aug 28;48(4):1707-1724. doi: 10.1042/BST20200193.

BioExcel Building Blocks, a software library for interoperable biomolecular simulation workflows.

Sci Data. 2019 Sep 10;6(1):169. doi: 10.1038/s41597-019-0177-4.

Continuous Evaluation of Ligand Protein Predictions: A Weekly Community Challenge for Drug Docking.

Structure. 2019 Aug 6;27(8):1326-1335.e4. doi: 10.1016/j.str.2019.05.012. Epub 2019 Jun 27.

Adaptive ensemble simulations of biomolecules.

Curr Opin Struct Biol. 2018 Oct;52:87-94. doi: 10.1016/j.sbi.2018.09.005. Epub 2018 Sep 25.

Combining experimental and simulation data of molecular processes via augmented Markov models.

Proc Natl Acad Sci U S A. 2017 Aug 1;114(31):8265-8270. doi: 10.1073/pnas.1704803114. Epub 2017 Jul 17.

Molecular dynamics simulations of membrane proteins and their interactions: from nanoscale to mesoscale.

Curr Opin Struct Biol. 2016 Oct;40:8-16. doi: 10.1016/j.sbi.2016.06.007. Epub 2016 Jun 21.
