The Howard Hughes Medical Institute and W.M. Keck Advanced Microscopy Laboratory, Department of Biochemistry and Biophysics, University of California, San Francisco, 600, 16th Street, CA 94158-2517, USA.
Ultramicroscopy. 2011 Jul;111(8):1137-43. doi: 10.1016/j.ultramic.2011.03.015. Epub 2011 Apr 1.
Full-resolution electron microscopic tomographic (EMT) reconstruction of large-scale tilt series requires significant computing power. The desire to perform multiple cycles of iterative reconstruction and realignment makes the need for faster reconstruction even more pressing. This has motivated us to develop a distributed multi-GPU (graphics processing unit) system that provides the computing power required for rapid, constrained, iterative reconstruction of very large three-dimensional (3D) volumes. The participating GPUs reconstruct segments of the volume in parallel, and the segments are subsequently assembled to form the complete 3D volume. Owing to its power and versatility, the CUDA (NVIDIA, USA) platform was selected for the GPU implementation of the EMT reconstruction. On a system containing 10 GPUs provided by 5 GTX295 cards, 10 cycles of SIRT reconstruction of a tomogram of 4096² × 512 voxels from an input tilt series of 122 projection images of 4096² pixels (single-precision float) take a total of 1845 s, of which 1032 s are computation and the remainder is system overhead. The same system takes only 39 s in total to reconstruct 1024² × 256 voxels from 122 projections of 1024² pixels. Although the system overhead is non-trivial, performance analysis indicates that adding GPUs to the system would steadily improve overall performance. The system can therefore be easily expanded to provide greater computing power for very large tomographic reconstructions, and especially for iterative cycles of reconstruction and realignment.
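The segment-wise scheme described above can be illustrated with a small CPU-only sketch. This is not the authors' CUDA implementation: it is a toy NumPy model that assumes a single tilt axis (so that slices perpendicular to the axis are independent), a simple pixel-driven parallel-beam projector, a positivity clamp as the constraint, and a ±60° tilt range. The names `projection_matrix`, `sirt` and `reconstruct_volume`, and all sizes, are hypothetical and chosen only for illustration.

```python
import numpy as np

def projection_matrix(n, angles_deg, n_det):
    """Toy pixel-driven parallel-beam projector for one n x n slice."""
    c = (n - 1) / 2.0
    ys, xs = np.mgrid[0:n, 0:n]
    xs = xs.ravel() - c
    ys = ys.ravel() - c
    cols = np.arange(n * n)
    A = np.zeros((len(angles_deg) * n_det, n * n))
    for k, ang in enumerate(np.deg2rad(angles_deg)):
        # Detector coordinate of every pixel centre at this tilt angle.
        t = xs * np.cos(ang) + ys * np.sin(ang) + (n_det - 1) / 2.0
        lo = np.floor(t).astype(int)
        w = t - lo
        # Linear interpolation: split each pixel between two detector bins.
        for idx, weight in ((lo, 1.0 - w), (lo + 1, w)):
            ok = (idx >= 0) & (idx < n_det)
            A[k * n_det + idx[ok], cols[ok]] += weight[ok]
    return A

def sirt(A, b, n_iter):
    """SIRT update x <- x + C A^T R (b - A x), with a positivity clamp."""
    row_sum = A.sum(axis=1)
    col_sum = A.sum(axis=0)
    R = np.divide(1.0, row_sum, out=np.zeros_like(row_sum), where=row_sum > 0)
    C = np.divide(1.0, col_sum, out=np.zeros_like(col_sum), where=col_sum > 0)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = x + C * (A.T @ (R * (b - A @ x)))
        x = np.maximum(x, 0.0)  # assumed constraint: non-negative density
    return x

def reconstruct_volume(A, sinograms, n, n_workers, n_iter):
    """Split the slices into segments, reconstruct each, then reassemble."""
    segments = np.array_split(np.arange(sinograms.shape[0]), n_workers)
    slabs = []
    for seg in segments:  # in a multi-GPU system, one segment per device
        slab = np.stack([sirt(A, sinograms[i], n_iter).reshape(n, n) for i in seg])
        slabs.append(slab)
    return np.concatenate(slabs, axis=0)

# Tiny synthetic example: 6 slices of 8 x 8 pixels, 25 tilt angles.
rng = np.random.default_rng(0)
n, n_det, n_slices = 8, 14, 6
angles = np.linspace(-60, 60, 25)
A = projection_matrix(n, angles, n_det)
truth = rng.random((n_slices, n, n))
sinograms = truth.reshape(n_slices, -1) @ A.T

volume = reconstruct_volume(A, sinograms, n, n_workers=4, n_iter=10)
single = reconstruct_volume(A, sinograms, n, n_workers=1, n_iter=10)

err0 = np.linalg.norm(sinograms)
err = np.linalg.norm(volume.reshape(n_slices, -1) @ A.T - sinograms)
print(volume.shape, err / err0)
```

Because each slice is reconstructed independently in this model, the assembled volume does not depend on how many segments the slices are divided into. Only the wall-clock time and the data-transfer overhead would change on real hardware.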