College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, 28644, S. Korea.
School of Electronic Engineering, University of Michigan, Ann Arbor, Michigan, 48109, USA.
J Nanosci Nanotechnol. 2021 Mar 1;21(3):1854-1861. doi: 10.1166/jnn.2021.18925.
There are many challenges in the hardware implementation of a neural network using nanoscale memristor crossbar arrays when analog cells are used. Multi-state or analog cells impose more stringent noise margins, which are difficult to meet in the presence of device variability. We propose a potential solution using a 1-bit memristor that stores the binary values "0" or "1" as its memristive states, namely a high-resistance state (HRS) and a low-resistance state (LRS). In addition, we propose a new architecture consisting of four parallel 1-bit memristors at each crosspoint of the array. The four parallel-connected 1-bit memristors represent five decimal values according to the number of activated memristors. This count is then mapped to a synaptic weight in the artificial neural network. We implement a convolutional neural network (CNN) model in TensorFlow using an equivalent quantized-weight mapping model, which yields learning results almost identical to those of a high-precision CNN model. This radix-5 CNN is then mapped to hardware on the proposed parallel-connected memristor crossbar array. We also propose a method for representing negative weights on a memristor crossbar array. We then verify the CNN hardware on an edge-AI (e-AI) platform developed on a field-programmable gate array (FPGA). On this e-AI platform, the five weight levels per crosspoint are represented using CLB logic. We test the learning results of the CNN hardware on the e-AI platform with a dataset of 4×4 images in three classes. The radix-5 CNN implementation achieves classification accuracy comparable to the high-precision case while halving the area of the memristor crossbar array, all verified on the FPGA. Implementing the CNN model on an FPGA board can contribute to the practical use of edge AI.
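To illustrate the kind of radix-5 quantized-weight mapping the abstract describes, the following is a minimal TensorFlow sketch, not the authors' code. The function names, the symmetric 5-level grid {-2, -1, 0, 1, 2}, and the offset used to convert signed levels into activated-cell counts are illustrative assumptions; the paper proposes its own scheme for representing negative weights on the crossbar.

```python
# Minimal sketch (assumptions noted): quantize full-precision weights to the
# five levels that four parallel 1-bit memristors can represent, where the
# stored level equals the number of activated (LRS) cells at a crosspoint.
import tensorflow as tf

NUM_LEVELS = 5          # four parallel 1-bit memristors -> counts 0..4


def quantize_radix5(w):
    """Quantize a weight tensor to five symmetric levels {-2,...,2} * scale."""
    scale = tf.reduce_max(tf.abs(w)) / 2.0     # two positive levels per sign
    q = tf.round(w / scale)                    # nearest integer level
    q = tf.clip_by_value(q, -2.0, 2.0)         # restrict to 5 levels total
    return q * scale, q


def to_memristor_counts(q_levels):
    """Map signed levels -2..2 to activated-cell counts 0..4.

    Assumption: negative weights are handled here by a simple offset for
    illustration only; the paper's negative-weight representation may differ.
    """
    return tf.cast(q_levels + 2, tf.int32)


# Usage example: quantize one 3x3 convolution kernel.
w = tf.random.normal([3, 3])
w_quantized, levels = quantize_radix5(w)       # values fed to the CNN model
counts = to_memristor_counts(levels)           # LRS cells per crosspoint (0..4)
print(counts.numpy())
```

In such a scheme the CNN is trained (or fine-tuned) with `w_quantized` standing in for the crossbar, so that the software model and the hardware mapping see the same five-level weights.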