
Training LSTM Networks With Resistive Cross-Point Devices

Authors

Gokmen Tayfun, Rasch Malte J, Haensch Wilfried

Affiliation

IBM Research AI, Yorktown Heights, NY, United States.

Publication

Front Neurosci. 2018 Oct 24;12:745. doi: 10.3389/fnins.2018.00745. eCollection 2018.

DOI: 10.3389/fnins.2018.00745
PMID: 30405334
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6207602/
Abstract

In our previous work we have shown that resistive cross-point devices, so-called resistive processing unit (RPU) devices, can provide significant power and speed benefits when training deep fully connected networks as well as convolutional neural networks. In this work, we further extend the RPU concept to training recurrent neural networks (RNNs), namely LSTMs. We show that the mapping of recurrent layers is very similar to the mapping of fully connected layers, and therefore the RPU concept can potentially provide large acceleration factors for RNNs as well. In addition, we study the effect of various device imperfections and system parameters on training performance. Symmetry of updates becomes even more crucial for RNNs; already a few percent asymmetry results in an increase in the test error compared to the ideal case trained with floating point numbers. Furthermore, the input signal resolution to the device arrays needs to be at least 7 bits for successful training. However, we show that a stochastic rounding scheme can reduce the input signal resolution back to 5 bits. Further, we find that RPU device variations and hardware noise are enough to mitigate overfitting, so that there is less need for using dropout. Here we attempt to study the validity of the RPU approach by simulating large-scale networks. For instance, the models studied here are roughly 1500 times larger than the more often studied multilayer perceptron models trained on the MNIST dataset in terms of the total number of multiplication and summation operations performed per epoch.
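The statement that recurrent layers map onto crossbar arrays much like fully connected layers has a concrete reading: in the standard stacked-gate formulation of an LSTM, the four gate matrices fuse into one weight matrix, so each time step is a single vector-matrix product over the concatenated input and hidden state, which is exactly the operation an analog array performs in place. A minimal NumPy sketch of that formulation (this is the textbook LSTM cell, not code from the paper; `lstm_step` and the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(W, b, x_t, h_prev, c_prev):
    """One LSTM time step as a single fused vector-matrix product.

    W stacks the input/forget/output/candidate gate matrices into one
    (4H, D + H) array, so the recurrent layer exercises the same
    primitive as a fully connected layer: one multiply of W with the
    concatenated [x_t, h_{t-1}] vector per time step.
    """
    z = W @ np.concatenate([x_t, h_prev]) + b  # the crossbar-sized operation
    i, f, o, g = np.split(z, 4)                # slice back into the four gates
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h_t = sigmoid(o) * np.tanh(c_t)
    return h_t, c_t

# Run a short random sequence through one cell.
D, H = 8, 16
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)
h = c = np.zeros(H)
for x_t in rng.normal(size=(5, D)):
    h, c = lstm_step(W, b, x_t, h, c)
```

On an RPU array the `W @ ...` line would run in the analog domain; the nonlinearities and state updates here stand in for peripheral digital circuitry.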

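The 7-bit versus 5-bit result concerns how finely the inputs to the arrays must be quantized. Stochastic rounding helps because its error is zero-mean: a value rounds up with probability equal to its fractional distance to the next level, so over many presentations the array sees the correct signal on average even at coarse resolution. A minimal sketch of such a quantizer, assuming a uniform signed input range (the paper's actual pulse-encoding scheme is not reproduced here; `quantize` and its parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, n_bits, x_max=1.0, stochastic=False):
    """Quantize values in [-x_max, x_max] onto 2**n_bits uniform levels.

    Deterministic mode snaps to the nearest level; stochastic mode rounds
    up with probability equal to the fractional distance to the next
    level, making the quantizer unbiased in expectation.
    """
    step = 2.0 * x_max / (2 ** n_bits - 1)
    pos = (np.clip(x, -x_max, x_max) + x_max) / step  # position in steps
    if stochastic:
        lo = np.floor(pos)
        pos = lo + (rng.random(np.shape(x)) < (pos - lo))
    else:
        pos = np.round(pos)
    return pos * step - x_max

# Averaged over repeated presentations, 5-bit stochastic rounding tracks
# the underlying signal more closely than a single deterministic pass.
x = rng.uniform(-1.0, 1.0, size=10_000)
for bits in (5, 7):
    det = quantize(x, bits)
    avg = np.mean([quantize(x, bits, stochastic=True) for _ in range(64)], axis=0)
    print(bits, "det err:", np.abs(det - x).mean(), "avg stoch err:", np.abs(avg - x).mean())
```

A single stochastic pass is noisier than deterministic rounding; the benefit shows up in expectation, which is what the averaged comparison above illustrates.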

Similar Articles

1. Training LSTM Networks With Resistive Cross-Point Devices.
Front Neurosci. 2018 Oct 24;12:745. doi: 10.3389/fnins.2018.00745. eCollection 2018.
2. Training Deep Convolutional Neural Networks with Resistive Cross-Point Devices.
Front Neurosci. 2017 Oct 10;11:538. doi: 10.3389/fnins.2017.00538. eCollection 2017.
3. Algorithm for Training Neural Networks on Resistive Device Arrays.
Front Neurosci. 2020 Feb 26;14:103. doi: 10.3389/fnins.2020.00103. eCollection 2020.
4. Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices: Design Considerations.
Front Neurosci. 2016 Jul 21;10:333. doi: 10.3389/fnins.2016.00333. eCollection 2016.
5. Enabling Training of Neural Networks on Noisy Hardware.
Front Artif Intell. 2021 Sep 9;4:699148. doi: 10.3389/frai.2021.699148. eCollection 2021.
6. A Post-training Quantization Method for the Design of Fixed-Point-Based FPGA/ASIC Hardware Accelerators for LSTM/GRU Algorithms.
Comput Intell Neurosci. 2022 May 11;2022:9485933. doi: 10.1155/2022/9485933. eCollection 2022.
7. Analog Resistive Switching Devices for Training Deep Neural Networks with the Novel Tiki-Taka Algorithm.
Nano Lett. 2024 Jan 24;24(3):866-872. doi: 10.1021/acs.nanolett.3c03697. Epub 2024 Jan 11.
8. Mixed-Precision Deep Learning Based on Computational Memory.
Front Neurosci. 2020 May 12;14:406. doi: 10.3389/fnins.2020.00406. eCollection 2020.
9. On-Chip Training Spiking Neural Networks Using Approximated Backpropagation With Analog Synaptic Devices.
Front Neurosci. 2020 Jul 7;14:423. doi: 10.3389/fnins.2020.00423. eCollection 2020.
10. Impact of Asymmetric Weight Update on Neural Network Training With Tiki-Taka Algorithm.
Front Neurosci. 2022 Jan 6;15:767953. doi: 10.3389/fnins.2021.767953. eCollection 2021.

Cited By

1. Fast and robust analog in-memory deep neural network training.
Nat Commun. 2024 Aug 20;15(1):7133. doi: 10.1038/s41467-024-51221-z.
2. Energy-based analog neural network framework.
Front Comput Neurosci. 2023 Mar 3;17:1114651. doi: 10.3389/fncom.2023.1114651. eCollection 2023.
3. Neural Network Training With Asymmetric Crosspoint Elements.

References

1. Equivalent-accuracy accelerated neural-network training using analogue memory.
Nature. 2018 Jun;558(7708):60-67. doi: 10.1038/s41586-018-0180-5. Epub 2018 Jun 6.
2. Training Deep Convolutional Neural Networks with Resistive Cross-Point Devices.
Front Neurosci. 2017 Oct 10;11:538. doi: 10.3389/fnins.2017.00538. eCollection 2017.
3. Li-Ion Synaptic Transistor for Low Power Analog Computing.
Front Artif Intell. 2022 May 9;5:891624. doi: 10.3389/frai.2022.891624. eCollection 2022.
4. Enabling Training of Neural Networks on Noisy Hardware.
Front Artif Intell. 2021 Sep 9;4:699148. doi: 10.3389/frai.2021.699148. eCollection 2021.
5. Utilizing the Switching Stochasticity of HfO/TiO-Based ReRAM Devices and the Concept of Multiple Device Synapses for the Classification of Overlapping and Noisy Patterns.
Front Neurosci. 2021 Jun 7;15:661856. doi: 10.3389/fnins.2021.661856. eCollection 2021.
6. Optogenetics inspired transition metal dichalcogenide neuristors for in-memory deep recurrent neural networks.
Nat Commun. 2020 Jun 25;11(1):3211. doi: 10.1038/s41467-020-16985-0.
7. Mixed-Precision Deep Learning Based on Computational Memory.
Front Neurosci. 2020 May 12;14:406. doi: 10.3389/fnins.2020.00406. eCollection 2020.
8. Algorithm for Training Neural Networks on Resistive Device Arrays.
Front Neurosci. 2020 Feb 26;14:103. doi: 10.3389/fnins.2020.00103. eCollection 2020.
9. Stochastic rounding and reduced-precision fixed-point arithmetic for solving neural ordinary differential equations.
Philos Trans A Math Phys Eng Sci. 2020 Mar 6;378(2166):20190052. doi: 10.1098/rsta.2019.0052. Epub 2020 Jan 20.
10. Streaming Batch Eigenupdates for Hardware Neural Networks.
Front Neurosci. 2019 Aug 6;13:793. doi: 10.3389/fnins.2019.00793. eCollection 2019.
11.
Adv Mater. 2017 Jan;29(4). doi: 10.1002/adma.201604310. Epub 2016 Nov 22.
12. Deep Visual-Semantic Alignments for Generating Image Descriptions.
IEEE Trans Pattern Anal Mach Intell. 2017 Apr;39(4):664-676. doi: 10.1109/TPAMI.2016.2598339. Epub 2016 Aug 5.
13. Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices: Design Considerations.
Front Neurosci. 2016 Jul 21;10:333. doi: 10.3389/fnins.2016.00333. eCollection 2016.
14. Energy Scaling Advantages of Resistive Memory Crossbar Based Computation and Its Application to Sparse Coding.
Front Neurosci. 2016 Jan 6;9:484. doi: 10.3389/fnins.2015.00484. eCollection 2015.
15. Deep learning.
Nature. 2015 May 28;521(7553):436-44. doi: 10.1038/nature14539.
16. Training and operation of an integrated neuromorphic network based on metal-oxide memristors.
Nature. 2015 May 7;521(7550):61-4. doi: 10.1038/nature14441.
17. Long short-term memory.
Neural Comput. 1997 Nov 15;9(8):1735-80. doi: 10.1162/neco.1997.9.8.1735.