Thayer School of Engineering, Dartmouth College, 8000 Cummings Hall, Hanover, NH 03755, USA.
Physiol Meas. 2012 Oct;33(10):1703-15. doi: 10.1088/0967-3334/33/10/1703. Epub 2012 Sep 26.
Image reconstruction in soft-field tomography is based on an inverse problem formulation, where a forward model is fitted to the data. In medical applications, where the anatomy presents complex shapes, it is common to use finite element models (FEMs) to represent the volume of interest and solve a partial differential equation that models the physics of the system. Over the last decade, interest has shifted from 2D to 3D modeling, as the underlying physics of most problems is 3D. Although the increased computational power of modern computers allows working with much larger FEM models, the computational time required to reconstruct 3D images on a fine 3D FEM model can be significant, on the order of hours. For example, in electrical impedance tomography (EIT) applications using a dense 3D FEM mesh with half a million elements, a single reconstruction iteration takes approximately 15-20 min with optimized routines running on a modern multi-core PC. It is desirable to accelerate image reconstruction so that researchers can explore data and reconstruction parameters more easily and rapidly. Furthermore, high-speed reconstruction is essential for some promising clinical applications of EIT. For 3D problems, 70% of the computing time is spent building the Jacobian matrix, and 25% of the time in forward solving. In this work, we focus on accelerating the Jacobian computation by using single and multiple GPUs. First, we discuss an optimized implementation on a modern multi-core PC architecture and show how computing time is bounded by the CPU-to-memory bandwidth; this factor limits the rate at which data can be fetched by the CPU. Gains associated with the use of multiple CPU cores are minimal, since data operands cannot be fetched fast enough to saturate the processing power of even a single CPU core. GPUs offer much higher memory bandwidth than CPUs and greater parallelism. We obtain acceleration factors of 20 on a single NVIDIA S1070 GPU and of 50 on four GPUs, bringing the Jacobian computing time for a fine 3D mesh from 12 min to 14 s. We regard this as an important step toward achieving interactive reconstruction times in 3D imaging, particularly when coupled in the future with acceleration of the forward problem. While we demonstrate results for EIT, these results apply to any soft-field imaging modality where the Jacobian matrix is computed with the adjoint method.
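The abstract names the adjoint method without showing it, so the following is a minimal sketch of one common formulation, not the authors' implementation. It assumes linear tetrahedral elements, where the potential gradient is constant within each element, so the sensitivity of a measurement to the conductivity of element e reduces to minus the element volume times the dot product of the drive-field and measurement-field gradients. The function name `adjoint_jacobian` and its argument layout are hypothetical and chosen only for illustration.

```python
def adjoint_jacobian(drive_grads, meas_grads, volumes):
    """Sketch of an adjoint-method Jacobian (illustrative, assumed layout).

    drive_grads[d][e] -> (gx, gy, gz) gradient of drive field d in element e
    meas_grads[m][e]  -> (gx, gy, gz) gradient of adjoint (measurement) field m
    volumes[e]        -> volume of element e
    Returns one row per (drive, measurement) pair, one column per element.
    """
    n_elems = len(volumes)
    jac = []
    for gd in drive_grads:        # loop over forward (drive) fields
        for gm in meas_grads:     # loop over adjoint (measurement) fields
            row = []
            for e in range(n_elems):
                # Dot product of the two constant per-element gradients.
                dot = (gd[e][0] * gm[e][0]
                       + gd[e][1] * gm[e][1]
                       + gd[e][2] * gm[e][2])
                # Assumed sign convention: J = -V_e * (grad u_d . grad u_m).
                row.append(-volumes[e] * dot)
            jac.append(row)
    return jac


# Toy input: one drive field, one measurement field, two elements.
drive_grads = [[(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]]
meas_grads = [[(0.5, 0.0, 0.0), (0.0, 1.0, 1.0)]]
volumes = [2.0, 0.5]
jacobian = adjoint_jacobian(drive_grads, meas_grads, volumes)
```

In this form each Jacobian entry reads six gradient components and one volume but performs only a handful of multiply-adds. That low ratio of arithmetic to memory traffic is consistent with the abstract's observation that the computation is limited by memory bandwidth rather than by processing power, and every entry is independent of the others, which is what makes the loop a natural candidate for GPU parallelism.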