Rui Ran, Li Hao, Tu Yi-Cheng
University of South Florida, Tampa, Florida, USA.
Proceedings VLDB Endowment. 2020 Dec;14(4):708-720. doi: 10.14778/3436905.3436927. Epub 2020 Dec 1.
Relational join processing is one of the core functionalities of database management systems. It has been demonstrated that GPUs, as a general-purpose parallel computing platform, are very promising for processing relational joins. However, join algorithms often need to handle very large input data, an issue that has not been sufficiently addressed in existing work. Moreover, as more and more desktop and workstation platforms support multi-GPU environments, the combined computing capability of multiple GPUs can easily match that of a computing cluster. It is therefore worth exploring how join processing can benefit from the adoption of multiple GPUs. We identify the low rate and complex patterns of data transfer among the CPU and GPUs as the main challenges in designing efficient algorithms for large table joins. To overcome these challenges, we propose three distinct designs of multi-GPU join algorithms, namely the nested-loop, global sort-merge, and hybrid joins, for large table joins with different join conditions. Extensive experiments on multiple databases and two different hardware configurations demonstrate that our algorithms scale well with data size and that the use of multiple GPUs brings a significant performance boost. Furthermore, our algorithms achieve much better performance than existing join algorithms, with speedups of up to 25X and 2.8X over the best known code developed for multi-core CPUs and GPUs, respectively.