Multi-Node Multi-GPU Diffeomorphic Image Registration for Large-Scale Imaging Problems.

Authors

Brunn Malte, Himthani Naveen, Biros George, Mehl Miriam, Mang Andreas

Affiliations

Computer Science, University of Stuttgart, Stuttgart, DE.

Oden Institute, University of Texas, Austin TX, US.

Publication information

Int Conf High Perform Comput Netw Storage Anal. 2020 Nov;2020. doi: 10.1109/sc41405.2020.00042.

Abstract

We present a Gauss-Newton-Krylov solver for large deformation diffeomorphic image registration. We extend the publicly available CLAIRE library to multi-node multi-GPU (graphics processing unit) systems and introduce novel algorithmic modifications that significantly improve performance. Our contributions comprise (i) a new preconditioner for the reduced-space Gauss-Newton Hessian system, (ii) a highly optimized multi-node multi-GPU implementation exploiting device direct communication for the main computational kernels (interpolation, high-order finite difference operators, and fast Fourier transform), and (iii) a comparison with state-of-the-art CPU and GPU implementations. We solve a 256³-resolution image registration problem in five seconds on a single NVIDIA Tesla V100, a 70% performance speedup over the state of the art. In our largest run, we register 2048³-resolution images (25 B unknowns; approximately 152× larger than the largest problem solved in state-of-the-art GPU implementations) on 64 nodes with 256 GPUs on TACC's Longhorn system.
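The problem sizes quoted in the abstract can be sanity-checked with a short calculation. The sketch below assumes, as the abstract does not state explicitly, that the unknown is a three-dimensional velocity field with three components per grid point. The helper name `num_unknowns` is hypothetical and used only for illustration.

```python
# Back-of-the-envelope check of the problem sizes quoted in the abstract.
# ASSUMPTION (not stated in the abstract): the unknown is a 3D velocity
# field with three components per grid point.

COMPONENTS_PER_POINT = 3  # assumed: one value per spatial direction


def num_unknowns(n: int, components: int = COMPONENTS_PER_POINT) -> int:
    """Unknowns for an n x n x n grid with `components` values per point."""
    return components * n ** 3


small = num_unknowns(256)   # single-GPU problem from the abstract
large = num_unknowns(2048)  # largest multi-node run from the abstract

print(f"256^3 grid:  {small:,} unknowns")
print(f"2048^3 grid: {large:,} unknowns (~{large / 1e9:.1f} B)")
print(f"size ratio:  {large // small}x")
```

Under this assumption the 2048³ grid yields about 25.8 billion unknowns, which is consistent with the "25 B unknowns" figure reported above.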

Similar articles

1
Multi-Node Multi-GPU Diffeomorphic Image Registration for Large-Scale Imaging Problems.
Int Conf High Perform Comput Netw Storage Anal. 2020 Nov;2020. doi: 10.1109/sc41405.2020.00042.
2
Fast GPU 3D diffeomorphic image registration.
J Parallel Distrib Comput. 2021 Mar;149:149-162. doi: 10.1016/j.jpdc.2020.11.006. Epub 2020 Dec 10.
3
CLAIRE: Constrained Large Deformation Diffeomorphic Image Registration on Parallel Computing Architectures.
J Open Source Softw. 2021;6(61). doi: 10.21105/joss.03038. Epub 2021 May 30.
4
CLAIRE: A DISTRIBUTED-MEMORY SOLVER FOR CONSTRAINED LARGE DEFORMATION DIFFEOMORPHIC IMAGE REGISTRATION.
SIAM J Sci Comput. 2019;41(5):C548-C584. doi: 10.1137/18m1207818. Epub 2019 Oct 24.
5
Efficient methods for implementation of multi-level nonrigid mass-preserving image registration on GPUs and multi-threaded CPUs.
Comput Methods Programs Biomed. 2016 Apr;127:290-300. doi: 10.1016/j.cmpb.2015.12.018. Epub 2016 Jan 6.
6
A SEMI-LAGRANGIAN TWO-LEVEL PRECONDITIONED NEWTON-KRYLOV SOLVER FOR CONSTRAINED DIFFEOMORPHIC IMAGE REGISTRATION.
SIAM J Sci Comput. 2017;39(6):B1064-B1101. doi: 10.1137/16M1070475. Epub 2017 Nov 21.
7
An Inexact Newton-Krylov Algorithm for Constrained Diffeomorphic Image Registration.
SIAM J Imaging Sci. 2015;8(2):1030-1069. doi: 10.1137/140984002. Epub 2015 May 5.
8
A LAGRANGIAN GAUSS-NEWTON-KRYLOV SOLVER FOR MASS- AND INTENSITY-PRESERVING DIFFEOMORPHIC IMAGE REGISTRATION.
SIAM J Sci Comput. 2017;39(5):B860-B885. doi: 10.1137/17M1114132. Epub 2017 Sep 26.
9
Newton-Raphson preconditioner for Krylov type solvers on GPU devices.
Springerplus. 2016 Jun 21;5(1):788. doi: 10.1186/s40064-016-2346-7. eCollection 2016.
10
Accelerating B-spline interpolation on GPUs: Application to medical image registration.
Comput Methods Programs Biomed. 2020 Sep;193:105431. doi: 10.1016/j.cmpb.2020.105431. Epub 2020 Mar 3.

Cited by

2
CLAIRE: Constrained Large Deformation Diffeomorphic Image Registration on Parallel Computing Architectures.
J Open Source Softw. 2021;6(61). doi: 10.21105/joss.03038. Epub 2021 May 30.
3
Down-sampling template curve to accelerate LDDMM-curve with application to shape analysis of the corpus callosum.
Healthc Technol Lett. 2021 May 2;8(3):78-83. doi: 10.1049/htl2.12011. eCollection 2021 Jun.
