Fu Zhisong, Kirby Robert M, Whitaker Ross T
The Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT 84112.
SIAM J Sci Comput. 2013;35(5):c473-c494. doi: 10.1137/120881956.
Generating numerical solutions to the eikonal equation and its many variations has a broad range of applications in both the natural and computational sciences. Efficient solvers on cutting-edge, parallel architectures require new algorithms that may not be theoretically , but that are designed to allow asynchronous solution updates and have limited memory access patterns. This paper presents a parallel algorithm for solving the eikonal equation on fully unstructured tetrahedral meshes. The method is appropriate for the type of fine-grained parallelism found on modern massively-SIMD architectures such as graphics processors and takes into account the particular constraints and capabilities of these computing platforms. This work builds on previous work for solving these equations on triangle meshes; in this paper we adapt and extend previous two-dimensional strategies to accommodate three-dimensional, unstructured, tetrahedralized domains. These new developments include a local update strategy with data compaction for tetrahedral meshes that provides solutions on both serial and parallel architectures, with a generalization to inhomogeneous, anisotropic speed functions. We also propose two new update schemes, specialized to mitigate the natural data increase observed when moving to three dimensions, and the data structures necessary for efficiently mapping data to parallel SIMD processors in a way that maintains . Finally, we present descriptions of the implementations for a single CPU, as well as multicore CPUs with shared memory and SIMD architectures, with comparative results against state-of-the-art eikonal solvers.
为程函方程及其多种变体生成数值解在自然科学和计算科学中都有广泛的应用。在前沿的并行架构上实现高效求解器需要新的算法,这些算法可能在理论上并非最优,但旨在允许异步解更新且具有有限的内存访问模式。本文提出了一种在完全非结构化四面体网格上求解程函方程的并行算法。该方法适用于在现代大规模SIMD架构(如图形处理器)上发现的细粒度并行类型,并考虑了这些计算平台的特殊约束和能力。这项工作建立在先前在三角形网格上求解这些方程的工作基础之上;在本文中,我们对先前的二维策略进行了调整和扩展,以适应三维非结构化四面体化区域。这些新进展包括一种针对四面体网格的数据压缩局部更新策略,该策略可在串行和并行架构上提供解,并推广到非均匀、各向异性速度函数。我们还提出了两种新的更新方案,专门用于减轻向三维转换时自然出现的数据增加问题,以及以保持某种状态的方式将数据有效映射到并行SIMD处理器所需的数据结构。最后,我们给出了在单个CPU以及具有共享内存和SIMD架构的多核CPU上的实现描述,并与最先进的程函求解器进行了对比结果。