Distributed Interactive Visualization Using GPU-Optimized Spark.

Author information

Hong Sumin, Choi Junyoung, Jeong Won-Ki

Publication information

IEEE Trans Vis Comput Graph. 2021 Sep;27(9):3670-3684. doi: 10.1109/TVCG.2020.2990894. Epub 2021 Jul 29.

Abstract

With advances in imaging and computing technologies, large-scale data acquisition and processing have become commonplace in many science and engineering disciplines. Conventional workflows for large-scale data processing usually rely on in-house or commercial software designed for domain-specific computing tasks. Recent advances in MapReduce, which was originally developed for batch processing of textual data via a simplified programming model built on map and reduce functions, have expanded its applications to more general big-data processing tasks, such as scientific computing and biomedical image processing. However, as shown in previous work, volume rendering and visualization using MapReduce are still considered challenging and impractical owing to the disk-based, batch-processing nature of its computing model. In this article, contrary to this common belief, we show that the MapReduce computing model can be effectively used for interactive visualization. Our proposed system is a novel extension of Spark, one of the most popular open-source MapReduce frameworks, that offers GPU-accelerated MapReduce computing. To minimize CPU-GPU communication and overcome slow, disk-based shuffle performance, the proposed system supports GPU in-memory caching and MPI-based direct communication between compute nodes. To allow for GPU-accelerated in-situ visualization using raster graphics in Spark, we leverage CUDA-OpenGL interoperability, which makes processing several orders of magnitude faster than in conventional MapReduce systems. We demonstrate the performance of our system via several volume processing and visualization tasks, such as direct volume rendering, iso-surface extraction, and numerical simulations with in-situ visualization.
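To make concrete how direct volume rendering can fit a map-and-reduce model, the following is a minimal CPU-only sketch in plain Python. It is not the authors' system and uses no Spark, GPU, or MPI calls. It assumes an orthographic view along the z axis, a volume split into depth-ordered bricks, and premultiplied colors; the names `render_brick`, `over`, and `render` are hypothetical and chosen only for illustration. The map step ray-marches each brick independently into a partial image, and the reduce step merges the partial images with the standard "over" compositing operator.

```python
from functools import reduce

def render_brick(brick):
    """Map step (illustrative): ray-march one brick front to back.

    brick is (depth_index, samples_per_pixel), where samples_per_pixel holds,
    for every pixel, the (color, alpha) samples along that pixel's ray.
    Returns (depth_index, partial_image) with premultiplied (color, alpha) pixels.
    """
    depth, samples_per_pixel = brick
    image = []
    for samples in samples_per_pixel:
        color, alpha = 0.0, 0.0
        for c, a in samples:
            # Front-to-back accumulation: later samples are attenuated
            # by the opacity gathered so far.
            color += (1.0 - alpha) * a * c
            alpha += (1.0 - alpha) * a
        image.append((color, alpha))
    return depth, image

def over(front, back):
    """Reduce step (illustrative): composite a nearer image over a farther one."""
    return [
        (cf + (1.0 - af) * cb, af + (1.0 - af) * ab)
        for (cf, af), (cb, ab) in zip(front, back)
    ]

def render(bricks):
    """Render all bricks independently, then composite them in depth order."""
    partials = map(render_brick, bricks)
    # "over" is associative but not commutative, so the partial images
    # must be ordered by depth before they are reduced.
    ordered = [image for _, image in sorted(partials, key=lambda p: p[0])]
    return reduce(over, ordered)
```

Because the "over" operator is associative, the partial images can be merged pairwise in any grouping as long as depth order is preserved, which is what lets the compositing stage be expressed as a reduction. Rendering a ray as one brick or as two consecutive bricks gives the same pixel, for example `render([(0, [[(1.0, 0.5)]]), (1, [[(0.5, 0.5)]])])` matches `render([(0, [[(1.0, 0.5), (0.5, 0.5)]])])`.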
