Quantized Convolutional Neural Network Implementation on a Parallel-Connected Memristor Crossbar Array for Edge AI Platforms.

Affiliations

College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, 28644, S. Korea.

School of Electronic Engineering, University of Michigan, Ann Arbor, Michigan, 48109, USA.

Publication information

J Nanosci Nanotechnol. 2021 Mar 1;21(3):1854-1861. doi: 10.1166/jnn.2021.18925.

Abstract

There are many challenges in the hardware implementation of a neural network using nanoscale memristor crossbar arrays where the use of analog cells is concerned. Multi-state or analog cells impose more stringent noise margins, which are difficult to meet in light of variability. We propose a potential solution using a 1-bit memristor that stores a binary value, "0" or "1", in its memristive state: a high-resistance state (HRS) or a low-resistance state (LRS). In addition, we propose a new architecture consisting of four parallel 1-bit memristors at each crosspoint of the array. The four 1-bit memristors connected in parallel represent 5 decimal values (0 through 4) according to the number of activated memristors. This value is then mapped to a synaptic weight, which corresponds to the state of an artificial neuron in a neural network. We implement a convolutional neural network (CNN) model in the TensorFlow framework using an equivalent quantized weight mapping model that demonstrates learning results almost identical to those of a high-precision CNN model. This radix-5 CNN is mapped to hardware on the proposed parallel-connected memristor crossbar array. We also propose a method for negative weight representation on a memristor crossbar array. We then verify the CNN hardware on an edge-AI (e-AI) platform developed on a field-programmable gate array (FPGA). In this e-AI platform, we represent five weights per crosspoint using configurable logic block (CLB) logic. We test the learning results of the CNN hardware on the e-AI platform with a dataset consisting of 4×4 images in three classes. We verify the functionality of our radix-5 CNN implementation, which shows classification accuracy comparable to high-precision use cases while halving the area of the memristor crossbar array, all verified on an FPGA. Implementing the CNN model on the FPGA board can contribute to the practical use of edge AI.
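
The five-level crosspoint described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the conductance values `G_LRS` and `G_HRS`, the function names, the round-to-nearest scaling rule, and the reading of "activated" as LRS are all assumptions made for illustration. Only non-negative weight magnitudes are handled, because the abstract does not describe the negative-weight scheme.

```python
# Illustrative sketch only: constants, names, and the scaling rule are
# assumptions, not taken from the paper.

G_LRS = 1.0e-4  # assumed conductance of a cell in the low-resistance state (S)
G_HRS = 1.0e-7  # assumed conductance of a cell in the high-resistance state (S)
CELLS_PER_CROSSPOINT = 4  # four parallel 1-bit memristors per crosspoint


def quantize_to_level(weight, w_max):
    """Map a weight in [0, w_max] to a level 0..4 (number of active cells).

    Out-of-range weights are clipped; the scaled value is rounded to the
    nearest integer level.
    """
    if w_max <= 0:
        raise ValueError("w_max must be positive")
    clipped = min(max(weight, 0.0), w_max)
    return int(clipped / w_max * CELLS_PER_CROSSPOINT + 0.5)


def cell_states(level):
    """Return the four binary cell states for a level (1 = LRS, 0 = HRS)."""
    if not 0 <= level <= CELLS_PER_CROSSPOINT:
        raise ValueError("level must be between 0 and 4")
    return [1] * level + [0] * (CELLS_PER_CROSSPOINT - level)


def crosspoint_conductance(level):
    """Total conductance of the crosspoint; parallel conductances add."""
    return sum(G_LRS if bit else G_HRS for bit in cell_states(level))


if __name__ == "__main__":
    for w in (0.0, 0.3, 0.5, 0.8, 1.0):
        level = quantize_to_level(w, w_max=1.0)
        print(w, level, cell_states(level), crosspoint_conductance(level))
```

Because conductances in parallel add, each additional cell switched to LRS raises the crosspoint conductance by roughly one fixed step, which is why four binary cells give five evenly spaced weight levels under these assumptions.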
