Harvey M J, De Fabritiis G, Giupponi G
Information and Communications Technologies, Imperial College London, South Kensington, London SW7 2AZ, United Kingdom.
Phys Rev E Stat Nonlin Soft Matter Phys. 2008 Nov;78(5 Pt 2):056702. doi: 10.1103/PhysRevE.78.056702. Epub 2008 Nov 6.
Accelerator processors like the new Cell processor are extending the traditional platforms for scientific computation, allowing orders of magnitude more floating-point operations per second (flops) compared to standard central processing units. However, they currently lack double-precision support and support for some IEEE 754 capabilities. In this work, we develop a lattice-Boltzmann (LB) code to run on the Cell processor and test the accuracy of this lattice method on this platform. We run tests for different flow topologies, boundary conditions, and Reynolds numbers in the range Re = 6 to 350. In one case, simulation results show reduced mass and momentum conservation compared to an equivalent double-precision LB implementation. All other cases demonstrate the utility of the Cell processor for fluid dynamics simulations. Benchmarks are performed on two Cell-based platforms, the Sony PlayStation 3 and the QS20/QS21 IBM blade, obtaining speed-up factors of 7 and 21, respectively, compared to the original PC version of the code, and a conservative sustained performance of 28 gigaflops per Cell processor. Our results suggest that the choice of IEEE 754 rounding mode is possibly as important as double-precision support for this specific scientific application.
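The closing remark about rounding mode can be illustrated with a small, self-contained sketch. It is not the authors' LB code, and the abstract does not state which rounding mode the Cell uses; the sketch assumes the single-precision units truncate (round toward zero) and compares that with round-to-nearest when a small positive increment is accumulated repeatedly, loosely analogous to summing populations when checking mass conservation. The names `f32_nearest`, `f32_toward_zero`, and `accumulate` are hypothetical helpers written for this illustration.

```python
import struct


def f32_nearest(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_toward_zero(x: float) -> float:
    """Round a Python float to binary32 by truncating toward zero.

    Assumption for illustration: this emulates a truncating
    single-precision unit; it is not taken from the paper.
    """
    y = f32_nearest(x)
    if abs(y) > abs(x):
        # Nearest rounding moved away from zero, so step back by one
        # binary32 value (decrement the magnitude bits).
        bits = struct.unpack("<I", struct.pack("<f", y))[0]
        y = struct.unpack("<f", struct.pack("<I", bits - 1))[0]
    return y


def accumulate(increment: float, steps: int, rounder) -> float:
    """Add a binary32 increment `steps` times, rounding after every add."""
    inc = f32_nearest(increment)
    total = 0.0
    for _ in range(steps):
        # The binary64 sum of two binary32 values of this size is exact,
        # so `rounder` alone decides the single-precision result.
        total = rounder(total + inc)
    return total


STEPS = 10_000
INC32 = f32_nearest(0.1)
# Exact reference: STEPS * INC32 is representable in binary64.
REFERENCE = STEPS * INC32

nearest_total = accumulate(0.1, STEPS, f32_nearest)
truncated_total = accumulate(0.1, STEPS, f32_toward_zero)

if __name__ == "__main__":
    print("reference      :", REFERENCE)
    print("round-nearest  :", nearest_total, "error", nearest_total - REFERENCE)
    print("round-to-zero  :", truncated_total, "error", truncated_total - REFERENCE)
```

Under truncation every rounding error has the same sign, so the accumulated total can only fall below the exact sum. This one-sided drift is the kind of effect that could show up as a slow loss of a conserved quantity, independent of whether the arithmetic is single or double precision.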