Islam Riadul, Majurski Patrick, Kwon Jun, Sharma Anurag, Tummala Sri Ranga Sai Krishna
Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250, USA.
Sensors (Basel). 2024 Feb 19;24(4):1329. doi: 10.3390/s24041329.
Organizations managing high-performance computing systems face numerous challenges, including overall energy consumption, microprocessor clock-frequency limits, and the escalating cost of chip production. Processor speeds have plateaued over the last decade, remaining within the range of 2 GHz to 5 GHz. Brain-inspired computing holds substantial promise for mitigating these challenges; the spiking neural network (SNN) stands out in particular for its power efficiency compared with conventional design paradigms. Nevertheless, several key obstacles still impede the implementation of large-scale neural networks (NNs) on silicon: the absence of automated tools, the need for expertise across multiple domains, and the inability of existing algorithms to efficiently partition and place large SNN computations onto hardware. In this paper, we present an automated tool flow capable of converting any NN into an SNN, together with a novel graph-partitioning algorithm that strategically places SNNs on a network-on-chip (NoC), paving the way for future energy-efficient, high-performance computing. The methodology demonstrates its effectiveness by transforming ANN architectures into SNNs with a marginal average error penalty of only 2.65%. The proposed graph-partitioning algorithm reduces inter-synaptic communication by 14.22% and intra-synaptic communication by 87.58% on average, underscoring its effectiveness in optimizing NN communication pathways. Compared with a baseline graph-partitioning algorithm, the proposed approach achieves an average 79.74% decrease in latency and a 14.67% reduction in energy consumption. Using existing NoC tools, the energy-latency product of the resulting SNN architectures is, on average, 82.71% lower than that of the baseline architectures.
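To make the partitioning idea concrete, the sketch below shows a single Kernighan–Lin-style refinement pass that bipartitions a small neural-network connectivity graph so as to reduce the number of "cut" edges, a stand-in for the inter-core spike traffic that NoC placement seeks to minimize. The toy graph, the balanced pairwise-swap heuristic, and all function names are illustrative assumptions for exposition; they are not the paper's algorithm.

```python
def cut_size(edges, part):
    """Number of edges crossing the partition boundary."""
    return sum(1 for u, v in edges if part[u] != part[v])

def refine_bipartition(nodes, edges, part):
    """Greedy passes: swap the node pair that most reduces the cut.

    Pairwise swaps keep the two partitions the same size, mimicking
    balanced placement onto two equally sized NoC cores.
    """
    improved = True
    while improved:
        improved = False
        best = None
        base = cut_size(edges, part)
        for u in nodes:
            for v in nodes:
                if part[u] == 0 and part[v] == 1:
                    part[u], part[v] = 1, 0          # trial swap
                    gain = base - cut_size(edges, part)
                    part[u], part[v] = 0, 1          # undo
                    if gain > 0 and (best is None or gain > best[0]):
                        best = (gain, u, v)
        if best:
            _, u, v = best
            part[u], part[v] = 1, 0                  # commit best swap
            improved = True
    return part

# Toy "network": two 3-neuron clusters joined by one bridge edge,
# starting from a deliberately bad placement (cut = 5).
nodes = list(range(6))
edges = [(0, 1), (1, 2), (0, 2),      # cluster A
         (3, 4), (4, 5), (3, 5),      # cluster B
         (2, 3)]                      # bridge between clusters
part = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}
part = refine_bipartition(nodes, edges, part)
print(cut_size(edges, part))          # prints 1 (only the bridge remains cut)
```

After refinement, each cluster sits entirely on one side of the partition, so only the single bridge edge crosses cores; in an SNN placement this corresponds to routing only one synaptic connection over the NoC instead of five.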