Kutzner Carsten, Miletić Vedran, Palacio Rodríguez Karen, Rampp Markus, Hummer Gerhard, de Groot Bert L, Grubmüller Helmut
Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany.
Max Planck Computing and Data Facility, Garching, Germany.
J Comput Chem. 2025 Feb 15;46(5):e70059. doi: 10.1002/jcc.70059.
We benchmarked the performance of the GROMACS 2024 molecular dynamics (MD) code on a modern high-performance computing (HPC) cluster with AMD CPUs on up to 65,536 CPU cores. We used five different MD systems, ranging in size from about 82,000 to 204 million atoms, and evaluated their performance using two different Message Passing Interface (MPI) libraries, Intel-MPI and Open-MPI. The largest system showed near-perfect strong scaling up to 512 nodes or 65,536 cores, maintaining a parallel efficiency above 0.9 even at the highest level of parallelization. Energy efficiency for a given number of nodes was generally equal to or slightly better than parallel efficiency. We achieved peak performances of 687 ns/d for the 82k atom system, 116 ns/d for the 53M atom system, and about 35 ns/d for the largest 204M atom system. These results demonstrate that highly optimized software running on a state-of-the-art HPC cluster provides sufficient computing power to simulate biomolecular systems at the mesoscale of viruses and organelles, and potentially small cells in the near future.
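The abstract reports strong-scaling results as a parallel efficiency (above 0.9 at 512 nodes). As a minimal sketch, assuming the common definition of strong-scaling efficiency as measured speedup divided by ideal speedup relative to a reference run, the quantity could be computed as follows. The function name, the choice of reference run, and the numbers in the usage lines are illustrative assumptions and are not taken from the paper.

```python
def parallel_efficiency(perf_ref: float, nodes_ref: int,
                        perf: float, nodes: int) -> float:
    """Strong-scaling parallel efficiency relative to a reference run.

    perf_ref, perf: simulation performance (e.g. in ns/d) of the
    reference run and of the scaled run, for the same MD system.
    nodes_ref, nodes: number of nodes used in each run.

    Assumed definition (not confirmed by the source):
        efficiency = (perf / perf_ref) / (nodes / nodes_ref)
    """
    if perf_ref <= 0 or perf <= 0:
        raise ValueError("performance values must be positive")
    if nodes_ref <= 0 or nodes <= 0:
        raise ValueError("node counts must be positive")
    speedup = perf / perf_ref          # measured speedup
    ideal_speedup = nodes / nodes_ref  # speedup under perfect scaling
    return speedup / ideal_speedup


# Hypothetical numbers, for illustration only (not from the paper):
# 10 ns/d on 1 node and 36 ns/d on 4 nodes.
eff = parallel_efficiency(perf_ref=10.0, nodes_ref=1, perf=36.0, nodes=4)
print(f"parallel efficiency: {eff:.2f}")
```

An efficiency of 1.0 corresponds to perfect strong scaling: performance grows in proportion to the node count while the system size stays fixed.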