Xiao Yao, Nazarian Shahin, Bogdan Paul
University of Southern California, Los Angeles, CA, 90089, USA.
Sci Rep. 2023 Dec 19;13(1):22655. doi: 10.1038/s41598-023-48981-x.
The urgent need for low-latency, high-compute, and low-power on-board intelligence in autonomous systems, cyber-physical systems, robotics, edge computing, evolvable computing, and complex data science calls for determining the optimal amount and type of specialized hardware, together with reconfigurability capabilities. With these goals in mind, we propose a novel, comprehensive graph-analytics-based high-level synthesis (GAHLS) framework that efficiently analyzes complex high-level programs through a combination of compiler-based analysis and graph-theoretic optimization, and synthesizes them into message-passing domain-specific accelerators. The GAHLS framework first constructs a compiler-assisted dependency graph (CaDG) from the low level virtual machine (LLVM) intermediate representation (IR) of high-level programs and converts it into a hardware-friendly description. Next, the framework performs a memory design space exploration that accounts for the computational properties identified from the CaDG and optimizes system performance for higher bandwidth. The framework also performs a robust optimization that identifies CaDG subgraphs with similar computational structures and aggregates them into intelligent processing clusters in order to optimize the usage of the underlying hardware resources. Finally, the framework synthesizes this compressed, specialized CaDG into processing elements while optimizing system performance and area metrics. Evaluations of the GAHLS framework on several real-life applications (e.g., deep learning, brain-machine interfaces) demonstrate that it provides a 14.27× performance improvement compared to state-of-the-art approaches such as LegUp 6.2.
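To make the first and third steps of this pipeline concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical toy instruction list in place of real LLVM IR, builds a def-use dependency graph from it, and groups nodes whose upstream computation has the same shape. The recursive signature used here is a simple stand-in for the paper's robust subgraph optimization, and all names (`TOY_IR`, `build_dependency_graph`, `structural_signature`, `cluster_by_structure`) are invented for illustration.

```python
from collections import defaultdict

# Hypothetical toy IR: (result, opcode, operands). This is NOT real LLVM IR;
# it only mimics SSA-style instructions for illustration.
TOY_IR = [
    ("a", "load", ["p"]),
    ("b", "load", ["q"]),
    ("c", "mul", ["a", "b"]),
    ("d", "load", ["r"]),
    ("e", "load", ["s"]),
    ("f", "mul", ["d", "e"]),
    ("g", "add", ["c", "f"]),
]


def build_dependency_graph(instructions):
    """Map each result to the set of earlier results it consumes (def-use edges)."""
    defined = {res for res, _, _ in instructions}
    return {
        res: {op for op in operands if op in defined}
        for res, _, operands in instructions
    }


def structural_signature(node, graph, opcode_of):
    """Describe a node by its opcode and, recursively, its producers' signatures.

    Two nodes share a signature exactly when their upstream dependency
    trees have the same shape and opcodes.
    """
    children = sorted(
        structural_signature(producer, graph, opcode_of)
        for producer in graph[node]
    )
    return (opcode_of[node], tuple(children))


def cluster_by_structure(instructions):
    """Group results whose upstream computation is structurally identical."""
    graph = build_dependency_graph(instructions)
    opcode_of = {res: opcode for res, opcode, _ in instructions}
    clusters = defaultdict(list)
    for res, _, _ in instructions:
        clusters[structural_signature(res, graph, opcode_of)].append(res)
    return graph, dict(clusters)


if __name__ == "__main__":
    graph, clusters = cluster_by_structure(TOY_IR)
    for signature, members in clusters.items():
        print(signature[0], members)
```

In this toy input, the two multiply nodes fall into the same cluster because each consumes two loads, which is the kind of repeated structure that could in principle share one processing element. A real flow would have to parse actual LLVM IR and weigh performance and area, which this sketch does not attempt.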