Wellcome Centre for Integrative Neuroimaging - Centre for Functional Magnetic Resonance Imaging of the Brain (FMRIB), University of Oxford, Oxford, United Kingdom; Center for Biomedical Image Computing and Analytics (CBICA), Department of Radiology, University of Pennsylvania, Philadelphia, PA, United States.
Faculty of Information Technology and Bionics, Pazmany Peter Catholic University, Budapest, Hungary.
Neuroimage. 2019 Mar;188:598-615. doi: 10.1016/j.neuroimage.2018.12.015. Epub 2018 Dec 8.
The great potential of computational diffusion MRI (dMRI) relies on indirect inference of tissue microstructure and brain connections, since modelling and tractography frameworks map diffusion measurements to neuroanatomical features. This mapping, however, can be computationally very expensive, particularly given the trend towards larger datasets and the complexity of biophysical modelling. Limitations on computing resources can restrict data exploration and methodology development. A step forward is to take advantage of the computational power offered by recent parallel computing architectures, especially Graphics Processing Units (GPUs). GPUs are massively parallel processors that offer trillions of floating point operations per second, and have made possible the solution of computationally intensive scientific problems that were previously intractable. However, they are not inherently suited to all problems. Here, we present two different frameworks for accelerating dMRI computations using GPUs that cover the most typical dMRI applications: a framework for performing biophysical modelling and microstructure estimation, and a second framework for performing tractography and long-range connectivity estimation. The former provides a front-end and automatically generates a GPU executable file from a user-specified biophysical model, allowing accelerated non-linear model fitting in both deterministic and stochastic (Bayesian inference) ways. The latter performs probabilistic tractography, can generate whole-brain connectomes and supports new functionality for imposing anatomical constraints, such as inherent consideration of surface meshes (GIFTI files) along with volumetric images. We validate the frameworks against well-established CPU-based implementations and show that, despite the very different challenges of parallelising these problems, a single GPU achieves better performance than 200 CPU cores thanks to our parallel designs.