Vasconcellos Eduardo C, Clua Esteban W G, Fenton Flavio H, Zamith Marcelo
Institute of Computing, Fluminense Federal University, Niterói, Brazil.
School of Physics, Georgia Institute of Technology, Atlanta, Georgia.
Concurr Comput. 2020 Mar 10;32(5). doi: 10.1002/cpe.5528. Epub 2019 Oct 23.
Simulations of cardiac electrophysiological models in tissue, particularly in 3D, require solving billions of differential equations even for just a couple of milliseconds of activity, and are therefore highly demanding in computational resources. In fact, even studies in small domains with very complex models may take several hours to reproduce seconds of electrical cardiac behavior. Today's Graphics Processing Units (GPUs) are becoming a way to accelerate such simulations, with the added possibility of running them locally without the need for supercomputers. Nevertheless, when using GPUs, bottlenecks related to global memory access, caused by the spatial discretization of the large tissue domains being simulated, become a big challenge. For simulations on a single GPU, we propose a strategy to accelerate the computation of the diffusion term through a data structure and memory access pattern designed to maximize coalesced memory transactions and minimize branch divergence, achieving results approximately 1.4 times faster than a standard GPU method. We also combine this data structure with a purpose-designed communication strategy to take advantage of multi-GPU platforms. We demonstrate that, with the multi-GPU approach, simulations in 3D tissue can be just 4× slower than real time.
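The abstract names two goals for the diffusion term, coalesced memory access and little branch divergence, without describing the data structure itself. As a rough illustration only, and not the authors' method, the following CPU reference sketch in Python shows one common way to pursue both on a regular grid: a flat array with x varying fastest, and neighbor indices clamped with min/max instead of boundary branches. The names `idx`, `clamp`, and `laplacian`, the 7-point stencil, the uniform spacing `h`, and the clamped (simple zero-flux) boundary are all assumptions made for this example.

```python
def idx(x, y, z, nx, ny):
    """Flat index with x varying fastest (hypothetical layout)."""
    return x + nx * (y + ny * z)


def clamp(i, n):
    """Clamp i to [0, n - 1] using min/max rather than an if/else branch."""
    return max(0, min(i, n - 1))


def laplacian(u, nx, ny, nz, h=1.0):
    """7-point Laplacian of the flat field u on an nx*ny*nz grid.

    Out-of-range neighbors are replaced by the boundary cell itself,
    a simple zero-flux approximation assumed for this sketch.
    """
    out = [0.0] * (nx * ny * nz)
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):  # innermost loop walks contiguous memory
                center = u[idx(x, y, z, nx, ny)]
                neighbors = (
                    u[idx(clamp(x - 1, nx), y, z, nx, ny)]
                    + u[idx(clamp(x + 1, nx), y, z, nx, ny)]
                    + u[idx(x, clamp(y - 1, ny), z, nx, ny)]
                    + u[idx(x, clamp(y + 1, ny), z, nx, ny)]
                    + u[idx(x, y, clamp(z - 1, nz), nx, ny)]
                    + u[idx(x, y, clamp(z + 1, nz), nx, ny)]
                )
                out[idx(x, y, z, nx, ny)] = (neighbors - 6.0 * center) / (h * h)
    return out
```

In a GPU kernel, the innermost `x` loop would typically be replaced by consecutive threads, so that a warp reads consecutive addresses and the hardware can merge them into few memory transactions. The clamped indices mean every thread executes the same instructions at the domain boundary.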