Acun B, Hardy D J, Kale L V, Li K, Phillips J C, Stone J E
IBM Research Division, IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA
Theoretical and Computational Biophysics Group, Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
IBM J Res Dev. 2018 Nov-Dec;62(6):1-9. doi: 10.1147/jrd.2018.2888986. Epub 2018 Dec 21.
NAMD (NAnoscale Molecular Dynamics) is a parallel molecular dynamics application that has been used to make breakthroughs in understanding the structure and dynamics of large biomolecular complexes, including viruses such as HIV and various types of influenza. State-of-the-art biomolecular simulations often require integrating billions of timesteps, with all interatomic forces computed at each femtosecond timestep. Molecular dynamics simulation of large biomolecular systems and long-timescale biological phenomena therefore requires tremendous computing power, and NAMD harnesses thousands of heterogeneous processors to meet this demand. In this paper, we present algorithm improvements and performance optimizations that enable NAMD to achieve high performance on the IBM Newell platform (with POWER9 processors and NVIDIA Volta V100 GPUs), which underpins Oak Ridge National Laboratory's Summit and Lawrence Livermore National Laboratory's Sierra supercomputers. The June 2018 TOP500 list ranks Summit first, with a peak performance of 187 Petaflop/s, and Sierra third, with 119 Petaflop/s. Optimizations for NAMD on Summit include: changing data layouts for GPU acceleration and CPU vectorization, improving GPU offload efficiency, increasing performance with PAMI support in Charm++, improving the efficiency of FFT calculations, improving load balancing, enabling better CPU vectorization and cache performance, and providing an alternative thermostat through stochastic velocity rescaling. We also present performance scaling results on early Newell systems.
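The abstract names stochastic velocity rescaling as the alternative thermostat but does not describe it. As a rough illustration only, the sketch below follows the commonly published form of the method (Bussi, Donadio, and Parrinello), in which all velocities are multiplied by a single random factor chosen so that the kinetic energy samples the canonical distribution. It is not NAMD's implementation: the function name `svr_scale_factor`, its parameters, and the reduced units are assumptions made for this example.

```python
import math
import random


def svr_scale_factor(kinetic, target_kinetic, n_dof, dt, tau, rng):
    """Return the factor that multiplies every velocity for one thermostat step.

    kinetic        -- current total kinetic energy (hypothetical reduced units)
    target_kinetic -- target average kinetic energy, n_dof * kT / 2
    n_dof          -- number of degrees of freedom
    dt, tau        -- timestep and thermostat relaxation time (same units)
    rng            -- a random.Random instance
    """
    c = math.exp(-dt / tau)          # memory of the current kinetic energy
    r1 = rng.gauss(0.0, 1.0)         # one standard normal deviate
    # The sum of (n_dof - 1) squared standard normals is chi-squared
    # distributed, i.e. Gamma(shape=(n_dof - 1) / 2, scale=2).
    rest = rng.gammavariate((n_dof - 1) / 2.0, 2.0) if n_dof > 1 else 0.0
    ratio = target_kinetic / (n_dof * kinetic)
    alpha_sq = (c
                + ratio * (1.0 - c) * (r1 * r1 + rest)
                + 2.0 * r1 * math.sqrt(c * (1.0 - c) * ratio))
    # alpha_sq is non-negative analytically; the clamp guards against round-off.
    return math.sqrt(max(alpha_sq, 0.0))


def rescale_velocities(velocities, factor):
    """Apply the same scale factor to every velocity component."""
    return [v * factor for v in velocities]
```

In this form only a couple of random numbers are needed per step for the whole system, rather than one per atom as in a Langevin thermostat, which is one reason such a thermostat can be attractive for GPU-accelerated runs; whether that was the motivation in NAMD is not stated in the abstract.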