Pugmire David, Monroe Laura, Connor Davenport Carolyn, DuBois Andrew, DuBois David, Poole Stephen
Los Alamos National Laboratory, Los Alamos, NM 87545, USA.
IEEE Trans Vis Comput Graph. 2007 Jul-Aug;13(4):798-809. doi: 10.1109/TVCG.2007.1026.
This paper describes the first use of a Network Processing Unit (NPU) to perform hardware-based image composition in a distributed rendering system. The image composition step is a notorious bottleneck in a clustered rendering system. Furthermore, image compositing algorithms do not necessarily scale as data size and number of nodes increase. Previous researchers have addressed the composition problem via software and/or custom-built hardware. We used the heterogeneous multicore computation architecture of the Intel IXP28XX NPU, a fully programmable commercial off-the-shelf (COTS) technology, to perform the image composition step. With this design, we have attained a nearly fourfold performance increase over traditional software-based compositing methods, achieving sustained compositing rates of 22-28 fps on a 1,024 x 1,024 image. This system is fully scalable with a negligible penalty in frame rate, is entirely COTS, and is flexible with regard to operating system, rendering software, graphics cards, and node architecture. The NPU-based compositor has the additional advantage of being a modular compositing component that is eminently suitable for integration into existing distributed software visualization packages.
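To make the image composition step concrete, the following is a minimal software sketch of what a compositor in a distributed renderer might do. The abstract does not state which compositing operator the NPU implements, so this assumes depth-based (sort-last) compositing, where each node renders a partial image and the compositor keeps the nearest fragment per pixel. The pixel layout, the function names composite_pair and composite_all, and the sample data are all hypothetical and are not taken from the paper.

```python
from functools import reduce

# Assumed pixel layout for illustration only: (color, depth),
# where a smaller depth means the fragment is closer to the viewer.
FAR = float("inf")  # depth used for pixels a node did not cover


def composite_pair(img_a, img_b):
    """Per pixel, keep whichever of the two fragments is nearer."""
    if len(img_a) != len(img_b):
        raise ValueError("partial images must have the same size")
    return [a if a[1] <= b[1] else b for a, b in zip(img_a, img_b)]


def composite_all(partial_images):
    """Fold every node's partial image into one final frame."""
    return reduce(composite_pair, partial_images)


# Three hypothetical render nodes, each producing a 3-pixel partial image.
node0 = [("red", 0.2), (None, FAR), ("red", 0.9)]
node1 = [("blue", 0.5), ("blue", 0.4), (None, FAR)]
node2 = [(None, FAR), ("green", 0.7), ("green", 0.3)]

final = composite_all([node0, node1, node2])
print([color for color, _ in final])
```

Every pixel of every partial image passes through the compositor on each frame, so the work grows with both image size and node count. This is the step the paper moves from software onto the NPU.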