Zhejiang Lab, Hangzhou, Zhejiang, China.
College of Science and Engineering, Hamad Bin Khalifa University, Doha 5855, Qatar.
Neural Netw. 2024 Jul;175:106312. doi: 10.1016/j.neunet.2024.106312. Epub 2024 Apr 15.
In recent years, there has been significant advancement in memristor-based neural networks, positioning them as a pivotal processing-in-memory deployment architecture for a wide array of deep learning applications. Within this realm of progress, emerging parallel analog memristive platforms are prominent for their ability to generate multiple feature maps in a single processing cycle. A notable limitation, however, is that they are tailored to neural networks with fixed structures. In an orthogonal direction, recent research shows that neural architectures should be specialized for both the task and the deployment platform. Building on this, neural architecture search (NAS) methods effectively explore promising architectures in a large design space. However, NAS-derived architectures are generally heterogeneous and diverse, making them difficult to deploy on current single-prototype, customized, parallel analog memristive hardware circuits. Investigating a memristive analog deployment that covers the full search space is therefore a promising yet challenging problem. Motivated by this, and starting from the DARTS search space, we study the memristive hardware design of its primitive operations and propose a memristive all-inclusive hypernetwork that covers 2×10 network architectures. Computational simulations on 3 representative architectures (DARTS-V1, DARTS-V2, PDARTS) show that our memristive all-inclusive hypernetwork achieves promising results on the CIFAR10 dataset (89.2% accuracy for PDARTS at 8-bit quantization precision) and is compatible with all architectures in the full DARTS space. Hardware performance simulation indicates that the memristive all-inclusive hypernetwork incurs only slightly higher resource consumption than individual deployment (nearly the same power, a 22%∼25% increase in latency, and 1.5× the area), which is reasonable and may offer a tolerable trade-off for industrial deployment scenarios.
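The idea of one structure covering every architecture in the DARTS space builds on the continuous relaxation used by DARTS itself: each edge computes a softmax-weighted sum over all candidate primitive operations, so a single "all-inclusive" circuit can realize any discrete choice by setting the weights. The sketch below is a minimal, purely illustrative NumPy rendition of that mixed operation; the candidate ops here (skip-connect, zero, a scaling stand-in for a parameterized op) are placeholder assumptions, not the paper's actual memristive primitives.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over architecture parameters."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Stand-ins for DARTS primitive operations (illustrative only;
# the real search space uses convolutions, pooling, etc.).
candidate_ops = [
    lambda x: x,                 # skip-connect
    lambda x: np.zeros_like(x),  # zero (no connection)
    lambda x: 0.5 * x,           # placeholder for a parameterized op
]

def mixed_op(x, alpha):
    """DARTS-style mixed edge: a softmax-weighted sum of all
    candidate operations, so one structure covers every choice."""
    w = softmax(alpha)
    return sum(wi * op(x) for wi, op in zip(w, candidate_ops))

x = np.ones(4)
alpha = np.array([2.0, 0.0, 0.0])  # weights favoring skip-connect
y = mixed_op(x, alpha)
```

Selecting a discrete architecture then amounts to taking the argmax over each edge's `alpha`, which is what makes a single hardware substrate compatible with every architecture the search can return.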