Wilcox Chris, Strout Michelle Mills, Bieman James M
Computer Science Department, Colorado State University, 1873 Campus Delivery, Fort Collins, CO 80523, USA. Tel.: +1 970 491 5792.
Sci Program. 2011 Dec 5;19(4):213-229. doi: 10.3233/SPR-2011-0329.
A number of scientific applications are performance-limited by expressions that repeatedly call costly elementary functions. Lookup table (LUT) optimization accelerates the evaluation of such functions by reusing previously computed results. LUT methods can speed up applications that tolerate an approximation of function results, thereby achieving a high level of fuzzy reuse. One problem with LUT optimization is the difficulty of controlling the tradeoff between performance and accuracy. The current practice of manual LUT optimization adds programming effort by requiring extensive experimentation to make this tradeoff, and such hand tuning can obfuscate algorithms. In this paper we describe a methodology and tool implementation to improve the application of software LUT optimization. Our Mesa tool implements source-to-source transformations for C or C++ code to automate the tedious and error-prone aspects of LUT generation such as domain profiling, error analysis, and code generation. We evaluate Mesa with five scientific applications. Our results show a performance improvement of 3.0× and 6.9× for two molecular biology algorithms, 1.4× for a molecular dynamics program, 2.1× to 2.8× for a neural network application, and 4.6× for a hydrology calculation. We find that Mesa enables LUT optimization with more control over accuracy and less effort than manual approaches.