Luce J, Hoggarth M, Lin J, Block A, Roeske J
Loyola University Medical Center, Maywood, IL.
Med Phys. 2012 Jun;39(6Part7):3673. doi: 10.1118/1.4734927.
To evaluate the efficiency gains obtained from using a Graphics Processing Unit (GPU) to perform Fourier transform (FT)-based image registration.
Fourier-based image registration involves obtaining the FT of the component images and analyzing them in Fourier space to determine the translations and rotations of one image set relative to another. An important property of FT registration is that by enlarging the images (adding additional pixels), one can obtain translations and rotations with sub-pixel resolution. The cost, however, is increased computational time. GPUs may decrease the computational time associated with FT image registration by taking advantage of their parallel architecture to perform matrix computations much more efficiently than a Central Processing Unit (CPU). To evaluate the computational gains produced by a GPU, images with known translational shifts were used. A program was written in the Interactive Data Language (IDL; Exelis, Boulder, CO) to perform CPU-based calculations. Subsequently, the program was modified using GPU bindings (Tech-X, Boulder, CO) to perform GPU-based computation on the same system. Multiple image sizes were used, ranging from 256×256 to 2304×2304. The time required by the CPU and by the GPU to complete the full algorithm was benchmarked, and the speed increase was defined as the ratio of CPU-to-GPU computational time.
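As an illustration of the translational part of such a method, the following is a minimal CPU-only sketch of phase correlation in Python with NumPy. It is not the authors' IDL program; the function name `phase_correlation_shift` and the synthetic test image are assumptions made for this example, and it recovers only integer-pixel shifts (rotation and the sub-pixel refinement obtained by enlarging the images are not shown).

```python
import numpy as np

def phase_correlation_shift(reference, moving):
    """Estimate the integer (row, col) shift of `moving` relative to `reference`.

    Hypothetical helper for illustration; assumes both images have the same
    shape and differ by a circular translation.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)

    # Normalized cross-power spectrum: keeps only the phase difference,
    # which encodes the translation (Fourier shift theorem).
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)

    # The inverse transform is sharply peaked at the translation.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    return tuple(
        int(p) if p <= n // 2 else int(p) - n
        for p, n in zip(peak, correlation.shape)
    )

# Usage: shift a synthetic image by a known amount and recover the shift.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
shifted = np.roll(image, shift=(5, -3), axis=(0, 1))
estimated = phase_correlation_shift(image, shifted)
```

Using an image with a known shift mirrors the evaluation described above, where the recovered translation can be checked against the applied one.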
The ratio of CPU-to-GPU time was greater than 1.0 for all images, indicating that the GPU performed the algorithm faster than the CPU. The smallest improvement, a ratio of 1.21, was found with the smallest image size of 256×256, and the largest speedup, a ratio of 4.25, was observed with the largest image size of 2304×2304.
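The speedup metric used here is simply the CPU time divided by the GPU time. A minimal sketch of how such a ratio might be measured is given below; the helper names `time_call` and `speedup_ratio` are hypothetical, the timed operation is a plain NumPy FFT on the CPU, and no GPU timing is performed or implied.

```python
import time
import numpy as np

def time_call(func, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

def speedup_ratio(cpu_seconds, gpu_seconds):
    """CPU-to-GPU time ratio; values above 1.0 mean the GPU was faster."""
    if gpu_seconds <= 0:
        raise ValueError("gpu_seconds must be positive")
    return cpu_seconds / gpu_seconds

# Usage: time a 2-D FFT on the CPU for the smallest image size in the study.
image = np.random.default_rng(0).random((256, 256))
cpu_seconds = time_call(np.fft.fft2, image)

# A GPU timing would come from the same call run through a GPU library;
# the value passed here is a placeholder, not a measurement.
example_ratio = speedup_ratio(cpu_seconds=2.0, gpu_seconds=1.0)
```

Taking the best of several repeats is one common way to reduce timing noise; the abstract does not state how the reported timings were collected.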
GPU programming reduced the computational time of an FT image registration algorithm by a factor of 1.21 to 4.25, depending on image size. The inclusion of the GPU may provide near real-time, sub-pixel registration capability.