Dynamic sampling rate: harnessing frame coherence in graphics applications for energy-efficient GPUs.

Author information

Anglada Martí, de Lucas Enrique, Parcerisa Joan-Manuel, Aragón Juan L, González Antonio

Affiliations

Departament d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Jordi Girona 1-3, Barcelona, 08034 Spain.

Imagination Technologies, Imagination House, King's Langley, WD4 8LZ UK.

Publication information

J Supercomput. 2022;78(13):14940-14964. doi: 10.1007/s11227-022-04413-7. Epub 2022 Apr 10.

Abstract

In real-time rendering, a 3D scene is modelled with meshes of triangles that the GPU projects onto the screen. The triangles are discretized by sampling each one at regular spatial intervals to generate fragments, to which a shader program then adds texture and lighting effects. Realistic scenes require detailed geometric models, complex shaders, high-resolution displays and high screen refresh rates, all of which come at a great cost in compute time and energy. This cost is often dominated by the fragment shader, which runs for each sampled fragment. Conventional GPUs sample the triangles once per pixel; however, many screen regions contain so little variation that they produce identical fragments and could be sampled below the pixel rate with no loss in quality. Additionally, because temporal frame coherence makes consecutive frames very similar, these regions of low variation usually persist from frame to frame. This work proposes Dynamic Sampling Rate (DSR), a novel hardware mechanism to reduce redundancy and improve energy efficiency in graphics applications. DSR analyzes the spatial frequencies of the scene once it has been rendered. Then, it leverages the temporal coherence of consecutive frames to decide, for each region of the screen, the lowest sampling rate for the next frame that maintains image quality. We evaluate the performance of a state-of-the-art mobile GPU architecture extended with DSR for a wide variety of applications. Experimental results show that DSR is able to remove most of the redundancy inherent in the color computations at fragment granularity, which brings average speedups of 1.68x and energy savings of 40%.
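
The per-region decision described in the abstract can be illustrated with a small software sketch. This is not the hardware mechanism from the paper: the tile size, the candidate sampling steps, the thresholds, and the variation metric (largest difference between adjacent pixels, used here as a crude stand-in for spatial-frequency analysis) are all assumptions made for illustration, as are the names `tile_variation` and `choose_rates`.

```python
# Illustrative sketch only. Tile size, thresholds, candidate steps and the
# variation metric are assumptions, not details taken from the paper.

def tile_variation(frame, y0, x0, size):
    """Largest absolute difference between horizontally or vertically
    adjacent pixels inside one tile of a grayscale frame."""
    worst = 0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            if x + 1 < x0 + size:
                worst = max(worst, abs(frame[y][x] - frame[y][x + 1]))
            if y + 1 < y0 + size:
                worst = max(worst, abs(frame[y][x] - frame[y + 1][x]))
    return worst


def choose_rates(frame, tile=4, thresholds=(0, 8)):
    """Analyze the frame that was just rendered and return, per tile, the
    sampling step to use in the next frame.

    step 1 = one sample per pixel, step 2 = one per 2x2 block,
    step 4 = one per 4x4 block. Frame dimensions are assumed to be
    multiples of `tile`.
    """
    rates = {}
    for ty in range(0, len(frame), tile):
        for tx in range(0, len(frame[0]), tile):
            v = tile_variation(frame, ty, tx, tile)
            if v <= thresholds[0]:
                step = 4   # flat tile: coarsest sampling
            elif v <= thresholds[1]:
                step = 2   # mild variation: half rate per axis
            else:
                step = 1   # high variation: keep pixel rate
            rates[(ty // tile, tx // tile)] = step
    return rates


# A 4x8 frame: the left tile is flat, the right tile has sharp vertical stripes.
frame = [[10, 10, 10, 10, 0, 255, 0, 255] for _ in range(4)]
rates = choose_rates(frame)
print(rates)
```

In this toy frame the flat left tile is assigned step 4 and the striped right tile keeps step 1. The sketch relies on the same premise as the abstract: because consecutive frames are very similar, a decision made from the rendered frame is expected to remain valid for the next one.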

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/875b/9360083/88c784128865/11227_2022_4413_Fig1_HTML.jpg
