Phillips James C, Sun Yanhua, Jain Nikhil, Bohm Eric J, Kalé Laxmikant V
SC Conf Proc. 2014;2014:81-91. doi: 10.1109/SC.2014.12.
Currently deployed petascale supercomputers typically use toroidal network topologies in three or more dimensions. While these networks perform well for topology-agnostic codes on a few thousand nodes, leadership machines with 20,000 nodes require topology awareness to avoid network contention for communication-intensive codes. Topology adaptation is complicated by irregular node allocation shapes and by holes due to dedicated input/output nodes or hardware failures. In the context of the popular molecular dynamics program NAMD, we present methods for mapping a periodic 3-D grid of fixed-size spatial decomposition domains to 3-D Cray Gemini and 5-D IBM Blue Gene/Q toroidal networks to enable hundred-million-atom full-machine simulations, and for similarly partitioning node allocations into compact domains for smaller simulations that use multiple-copy algorithms. Additional enabling techniques are discussed, and performance is reported for NCSA Blue Waters, ORNL Titan, ANL Mira, TACC Stampede, and NERSC Edison.
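To make concrete why placement matters on a torus, the following is a minimal illustrative sketch, not the mapping method used in NAMD. It assumes a complete, regular 3-D torus with no holes or irregular allocation shape (exactly the complications the abstract says real machines have), and all function names (`torus_hops`, `block_map`, `linear_map`, `mean_neighbor_hops`) are hypothetical. It compares the mean hop count between neighboring domains of a periodic 3-D grid under a simple proportional block placement and under a topology-agnostic rank-order placement.

```python
from itertools import product
from math import prod


def torus_hops(a, b, dims):
    """Minimal hop count between nodes a and b on a torus (wraparound links)."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def block_map(patch, patch_dims, torus_dims):
    """Illustrative placement: scale each grid axis onto the matching torus axis."""
    return tuple(p * t // g for p, g, t in zip(patch, patch_dims, torus_dims))


def linear_map(patch, patch_dims, torus_dims):
    """Topology-agnostic baseline: assign domains to nodes in rank order."""
    idx = 0
    for p, g in zip(patch, patch_dims):
        idx = idx * g + p  # row-major index of the domain
    node = idx * prod(torus_dims) // prod(patch_dims)
    coords = []
    for t in reversed(torus_dims):  # unravel node rank into torus coordinates
        coords.append(node % t)
        node //= t
    return tuple(reversed(coords))


def mean_neighbor_hops(patch_dims, torus_dims, placement):
    """Average hops from each domain to its +1 neighbor on every periodic axis."""
    total = count = 0
    for patch in product(*(range(g) for g in patch_dims)):
        here = placement(patch, patch_dims, torus_dims)
        for axis, g in enumerate(patch_dims):
            nbr = list(patch)
            nbr[axis] = (nbr[axis] + 1) % g  # periodic boundary of the grid
            there = placement(tuple(nbr), patch_dims, torus_dims)
            total += torus_hops(here, there, torus_dims)
            count += 1
    return total / count


if __name__ == "__main__":
    grid, torus = (8, 8, 8), (4, 4, 4)  # assumed toy sizes, 8 domains per node
    print("block :", mean_neighbor_hops(grid, torus, block_map))
    print("linear:", mean_neighbor_hops(grid, torus, linear_map))
```

In this toy configuration the block placement keeps every neighbor exchange within one hop, while the rank-order placement spreads neighbors along one grid axis across several hops. A production mapping would additionally have to handle holes, irregular allocation shapes, and 5-D tori, none of which this sketch attempts.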