Fast acceleration of 2D wave propagation simulations using modern computational accelerators.

Authors

Wang Wei, Xu Lifan, Cavazos John, Huang Howie H, Kay Matthew

Affiliations

Computer and Information Sciences Department, University of Delaware, Newark, Delaware, United States of America.

Electrical and Computer Engineering Department, George Washington University, Washington, DC, United States of America.

Publication information

PLoS One. 2014 Jan 30;9(1):e86484. doi: 10.1371/journal.pone.0086484. eCollection 2014.

DOI:10.1371/journal.pone.0086484

PMID:24497950

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3907428/

Abstract

Recent developments in modern computational accelerators like Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to many scientists. This, plus the fact that parallelized code is usually not portable to different architectures, creates major challenges for exploiting the full capabilities of modern computational accelerators. In this work, we sought to overcome these challenges by studying how to achieve both automated parallelization using OpenACC and enhanced portability using OpenCL. We applied our parallelization schemes using GPUs as well as an Intel Many Integrated Core (MIC) coprocessor to reduce the run time of wave propagation simulations. We used a well-established 2D cardiac action potential model as a specific case study. To the best of our knowledge, we are the first to study auto-parallelization of 2D cardiac wave propagation simulations using OpenACC. Our results identify several approaches that provide substantial speedups. The OpenACC-generated GPU code achieved a speedup of more than 150x over the sequential implementation and required the addition of only a few OpenACC pragmas to the code. An OpenCL implementation ran on GPUs at least 200x faster than the sequential implementation and 30x faster than a parallelized OpenMP implementation. An OpenMP implementation on the Intel MIC coprocessor provided speedups of 120x with only a few code changes to the sequential implementation. We highlight that OpenACC provides an automatic, efficient, and portable approach to achieve parallelization of 2D cardiac wave simulations on GPUs. Our approach of using OpenACC, OpenCL, and OpenMP to parallelize this particular model on modern computational accelerators should be applicable to other computational models of wave propagation in multi-dimensional media.
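
The abstract notes that the OpenACC version needed only a few pragmas. As a rough illustration of what that can look like, the sketch below annotates one explicit finite-difference diffusion step on a 2D grid. It is not the authors' code or model: the function name `diffuse_step`, its parameters, the fixed-boundary treatment, and the data clauses are assumptions made for illustration, and the reaction (ion-channel) terms of a cardiac action potential model are omitted entirely.

```c
#include <assert.h>

/* Hypothetical example: one explicit Euler step of du/dt = D * laplacian(u)
 * on an nx-by-ny grid stored row-major. Boundary cells of u_next are left
 * untouched (an illustrative choice, not taken from the paper). */
void diffuse_step(const double *u, double *u_next, int nx, int ny,
                  double d_coef, double dt, double dx)
{
    assert(nx >= 3 && ny >= 3);
    const double k = d_coef * dt / (dx * dx);

    /* The single pragma below is the only accelerator-specific line.
     * A compiler without OpenACC support ignores it and runs the loops
     * sequentially. u_next uses copy (not copyout) so that its boundary
     * values survive the round trip to the device. */
    #pragma acc parallel loop collapse(2) copyin(u[0:nx*ny]) copy(u_next[0:nx*ny])
    for (int i = 1; i < ny - 1; ++i) {
        for (int j = 1; j < nx - 1; ++j) {
            int c = i * nx + j;
            /* Five-point stencil: left, right, up, down, center. */
            u_next[c] = u[c] + k * (u[c - 1] + u[c + 1] +
                                    u[c - nx] + u[c + nx] - 4.0 * u[c]);
        }
    }
}
```

Every interior cell is updated independently from the previous time level, which is what makes the loop nest safe to parallelize. Under the same assumptions, a multicore OpenMP variant would replace the pragma with `#pragma omp parallel for collapse(2)`. A real simulation would also keep the arrays resident on the device across time steps instead of copying them on every call.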

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0602/3907428/ec1d45744b07/pone.0086484.g001.jpg
