Dhull Seema, Misba Walid Al, Nisar Arshid, Atulasimha Jayasimha, Kaushik Brajesh Kumar
IEEE Trans Neural Netw Learn Syst. 2025 Mar;36(3):4996-5005. doi: 10.1109/TNNLS.2024.3369969. Epub 2025 Feb 28.
The quantization of synaptic weights using emerging nonvolatile memory (NVM) devices has emerged as a promising solution for implementing computationally efficient neural networks on resource-constrained hardware. However, the practical implementation of such synaptic weights is hampered by imperfect memory characteristics, specifically the limited number of available quantized states and the large intrinsic device variation and stochasticity involved in writing the synaptic states. This article presents on-chip training and inference of a neural network using a quantized magnetic domain wall (DW)-based synaptic array and CMOS peripheral circuits. A rigorous model of the magnetic DW device, accounting for stochasticity and process variations, has been utilized for the synapse. To achieve stable quantized weights, DW pinning has been realized by means of physical constrictions. Finally, a VGG8 architecture for CIFAR-10 image classification has been simulated using the extracted synaptic device characteristics. The performance in terms of accuracy, energy, latency, and area consumption has been evaluated while considering the process variations and nonidealities in the DW device as well as in the peripheral circuits. The proposed quantized neural network (QNN) architecture achieves efficient on-chip learning with 92.4% training and 90.4% inference accuracy, respectively. In comparison to a pure CMOS-based design, it demonstrates an overall improvement in area, energy, and latency by , , and , respectively.