Optimization of Applications with Non-blocking Neighborhood Collectives via Multisends on the Blue Gene/P Supercomputer.

Authors

Kumar Sameer, Heidelberger Philip, Chen Dong, Hines Michael

Affiliation

IBM T.J. Watson Research Center, Yorktown Heights, NY 10598.

Publication

Proc IPDPS (Conf). 2010 Apr 19;2010:1-11. doi: 10.1109/IPDPS.2010.5470407.

Abstract

We explore the multisend interface as a data mover interface to optimize applications with neighborhood collective communication operations. One of the limitations of the current MPI 2.1 standard is that the vector collective calls require counts and displacements (zero and nonzero bytes) to be specified for all the processors in the communicator. Further, all the collective calls in MPI 2.1 are blocking and do not permit overlap of communication with computation. We present the record-replay persistent optimization to the multisend interface, which minimizes the processor overhead of initiating the collective. We present four case studies with the multisend API on Blue Gene/P: (i) 3D-FFT, (ii) 4D nearest-neighbor exchange as used in Quantum Chromodynamics, (iii) NAMD, and (iv) the neural network simulator NEURON. Performance results show a 1.9× speedup with 32³ 3D-FFTs, a 1.9× speedup for 4D nearest-neighbor exchange with the 2⁴ problem, a 1.6× speedup in NAMD, and almost a 3× speedup in NEURON with 256K cells and 1k connections/cell.
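The limitation of vector collectives noted in the abstract can be illustrated with a minimal sketch. It is not the authors' code and does not call MPI; it only builds the per-rank `sendcounts` and `sdispls` arrays in the layout that `MPI_Alltoallv` expects (one entry per rank in the communicator). The helper names `torus_neighbors` and `alltoallv_args`, the 4×4×4×4 periodic process grid, and the 1024-byte message size are all assumptions chosen for illustration.

```python
from itertools import product


def torus_neighbors(rank, dims):
    """Ranks of the +/-1 neighbors of `rank` on a periodic grid (hypothetical helper)."""
    # Convert the linear rank to grid coordinates (row-major, last axis fastest).
    coords = []
    r = rank
    for d in reversed(dims):
        coords.append(r % d)
        r //= d
    coords.reverse()

    neighbors = set()
    for axis, step in product(range(len(dims)), (-1, 1)):
        c = list(coords)
        c[axis] = (c[axis] + step) % dims[axis]  # periodic wrap-around
        # Convert the neighbor's coordinates back to a linear rank.
        n = 0
        for ci, d in zip(c, dims):
            n = n * d + ci
        neighbors.add(n)
    return sorted(neighbors)


def alltoallv_args(rank, dims, bytes_per_neighbor):
    """Build full-length count and displacement arrays (hypothetical helper)."""
    nprocs = 1
    for d in dims:
        nprocs *= d

    # One entry per rank in the communicator, even though most are zero.
    sendcounts = [0] * nprocs
    for n in torus_neighbors(rank, dims):
        sendcounts[n] = bytes_per_neighbor

    # Displacements are the running sum of the counts.
    sdispls = [0] * nprocs
    offset = 0
    for p in range(nprocs):
        sdispls[p] = offset
        offset += sendcounts[p]
    return sendcounts, sdispls


dims = (4, 4, 4, 4)  # assumed process grid, 256 ranks
sendcounts, sdispls = alltoallv_args(0, dims, 1024)
nonzero = sum(1 for c in sendcounts if c)
print(len(sendcounts), nonzero)  # prints "256 8"
```

On this assumed grid each rank talks to only 8 neighbors, yet both arrays carry 256 entries. The array length grows with the communicator size while the number of nonzero entries stays fixed, which is the per-call overhead the abstract attributes to the MPI 2.1 vector collectives.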

Similar articles

2
Comparison of neuronal spike exchange methods on a Blue Gene/P supercomputer.
Front Comput Neurosci. 2011 Nov 18;5:49. doi: 10.3389/fncom.2011.00049. eCollection 2011.
3
Parallel network simulations with NEURON.
J Comput Neurosci. 2006 Oct;21(2):119-29. doi: 10.1007/s10827-006-7949-5. Epub 2006 May 26.
4
Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems.
Comput Intell Neurosci. 2016;2016:3676582. doi: 10.1155/2016/3676582. Epub 2016 Jun 20.
6
Demonstration of Algorithmic Quantum Speedup.
Phys Rev Lett. 2023 May 26;130(21):210602. doi: 10.1103/PhysRevLett.130.210602.
7
Parallel sequential minimal optimization for the training of support vector machines.
IEEE Trans Neural Netw. 2006 Jul;17(4):1039-49. doi: 10.1109/TNN.2006.875989.
8
Superposition on a multicomputer system.
Med Phys. 1991 May-Jun;18(3):468-73. doi: 10.1118/1.596650.
10
A quantum circuit simulator and its applications on Sunway TaihuLight supercomputer.
Sci Rep. 2021 Jan 11;11(1):355. doi: 10.1038/s41598-020-79777-y.

Cited by

1
Fast Simulation of a Multi-Area Spiking Network Model of Macaque Cortex on an MPI-GPU Cluster.
Front Neuroinform. 2022 Jul 4;16:883333. doi: 10.3389/fninf.2022.883333. eCollection 2022.
2
CoreNEURON: An Optimized Compute Engine for the NEURON Simulator.
Front Neuroinform. 2019 Sep 19;13:63. doi: 10.3389/fninf.2019.00063. eCollection 2019.
3
Asynchronous Branch-Parallel Simulation of Detailed Neuron Models.
Front Neuroinform. 2019 Jul 23;13:54. doi: 10.3389/fninf.2019.00054. eCollection 2019.
4
Communication Sparsity in Distributed Spiking Neural Network Simulations to Improve Scalability.
Front Neuroinform. 2019 Apr 2;13:19. doi: 10.3389/fninf.2019.00019. eCollection 2019.
5
Integration of Continuous-Time Dynamics in a Spiking Neural Network Simulator.
Front Neuroinform. 2017 May 24;11:34. doi: 10.3389/fninf.2017.00034. eCollection 2017.
6
Early experiences in developing and managing the neuroscience gateway.
Concurr Comput. 2015 Feb 1;27(2):473-488. doi: 10.1002/cpe.3283.
7
Supercomputers ready for use as discovery machines for neuroscience.
Front Neuroinform. 2012 Nov 2;6:26. doi: 10.3389/fninf.2012.00026. eCollection 2012.
8
Comparison of neuronal spike exchange methods on a Blue Gene/P supercomputer.
Front Comput Neurosci. 2011 Nov 18;5:49. doi: 10.3389/fncom.2011.00049. eCollection 2011.
9
An Ultrascalable Solution to Large-scale Neural Tissue Simulation.
Front Neuroinform. 2011 Sep 19;5:15. doi: 10.3389/fninf.2011.00015. eCollection 2011.
