

Efficient nonlinear function approximation in analog resistive crossbars for recurrent neural networks.

Author Information

Yang Junyi, Mao Ruibin, Jiang Mingrui, Cheng Yichuan, Sun Pao-Sheng Vincent, Dong Shuai, Pedretti Giacomo, Sheng Xia, Ignowski Jim, Li Haoliang, Li Can, Basu Arindam

Affiliations

Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China.

Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, China.

Publication Information

Nat Commun. 2025 Jan 29;16(1):1136. doi: 10.1038/s41467-025-56254-6.

Abstract

Analog in-memory computing (IMC) has demonstrated energy-efficient, low-latency implementation of convolution and fully connected layers in deep neural networks (DNNs) by using physics for computing in parallel resistive memory arrays. However, recurrent neural networks (RNNs), which are widely used for speech recognition and natural language processing, have seen limited success with this approach. This can be attributed to the significant time and energy penalties incurred in implementing the nonlinear activation functions that are abundant in such models. In this work, we experimentally demonstrate a nonlinear activation function integrated with ramp analog-to-digital conversion (ADC) at the periphery of the memory to improve in-memory implementation of RNNs. Our approach uses an extra column of memristors to produce an appropriately pre-distorted ramp voltage such that the comparator output directly approximates the desired nonlinear function. We experimentally demonstrate programming different nonlinear functions using a memristive array and simulate its incorporation in RNNs to solve keyword spotting and language modelling tasks. Compared to other approaches, we demonstrate a manifold increase in area efficiency, energy efficiency, and throughput due to the in-memory, programmable ramp generator that removes digital processing overhead.
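The pre-distorted-ramp idea from the abstract can be sketched as a short behavioral model: if the ramp sweeps through f⁻¹ of a linearly spaced set of output levels, the clock count at which the comparator flips directly encodes f(v_in), so no digital activation step is needed. This is a minimal numerical sketch assuming a monotonic activation f; the function name, parameters, and resolution are illustrative, not the authors' hardware implementation:

```python
import numpy as np

def nonlinear_ramp_adc(v_in, f, n_steps=256, v_min=-1.0, v_max=1.0):
    """Digitize v_in so the output code approximates f(v_in).

    A conventional ramp ADC compares the input against a linear ramp
    and counts cycles until the comparator flips, giving a code
    proportional to the input. Here the ramp is pre-distorted to follow
    f^{-1} of a linear sweep (in the paper, generated by an extra
    memristor column), so the comparator flip directly yields f(v_in).
    Assumes f is monotonically increasing on [v_min, v_max].
    """
    # Target output levels, swept linearly in the codomain of f
    y_levels = np.linspace(f(v_min), f(v_max), n_steps)
    # Pre-distorted ramp: invert f numerically on a dense grid
    grid = np.linspace(v_min, v_max, 4096)
    ramp = np.interp(y_levels, f(grid), grid)  # ramp[k] = f^{-1}(y_levels[k])
    # Comparator: first ramp step that meets or exceeds the input
    code = min(int(np.searchsorted(ramp, v_in)), n_steps - 1)
    # The code already indexes the nonlinear output value
    return y_levels[code]

# Example: approximate tanh with an 8-bit pre-distorted ramp
approx = nonlinear_ramp_adc(0.3, np.tanh)
```

With 256 ramp steps the output matches tanh to within roughly one quantization step; raising `n_steps` trades conversion time for accuracy, mirroring the latency/precision trade-off of a hardware ramp ADC.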


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/34cc/11779922/362a8e2b9b50/41467_2025_56254_Fig1_HTML.jpg
