
Equivalent-accuracy accelerated neural-network training using analogue memory.

Affiliations

IBM Research-Almaden, San Jose, CA, USA.

IBM Research-Zurich, Rueschlikon, Switzerland.

Publication Information

Nature. 2018 Jun;558(7708):60-67. doi: 10.1038/s41586-018-0180-5. Epub 2018 Jun 6.

DOI: 10.1038/s41586-018-0180-5
PMID: 29875487
Abstract

Neural-network training can be slow and energy intensive, owing to the need to transfer the weight data for the network between conventional digital memory chips and processor chips. Analogue non-volatile memory can accelerate the neural-network training algorithm known as backpropagation by performing parallelized multiply-accumulate operations in the analogue domain at the location of the weight data. However, the classification accuracies of such in situ training using non-volatile-memory hardware have generally been less than those of software-based training, owing to insufficient dynamic range and excessive weight-update asymmetry. Here we demonstrate mixed hardware-software neural-network implementations that involve up to 204,900 synapses and that combine long-term storage in phase-change memory, near-linear updates of volatile capacitors and weight-data transfer with 'polarity inversion' to cancel out inherent device-to-device variations. We achieve generalization accuracies (on previously unseen data) equivalent to those of software-based training on various commonly used machine-learning test datasets (MNIST, MNIST-backrand, CIFAR-10 and CIFAR-100). The computational energy efficiency of 28,065 billion operations per second per watt and throughput per area of 3.6 trillion operations per second per square millimetre that we calculate for our implementation exceed those of today's graphical processing units by two orders of magnitude. This work provides a path towards hardware accelerators that are both fast and energy efficient, particularly on fully connected neural-network layers.
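The scheme the abstract describes can be pictured concretely: each weight lives on an analogue device pair at the array location, the multiply-accumulate of backpropagation happens in parallel across that array, frequent updates go to a fast near-linear volatile element, and the accumulated value is occasionally transferred into the non-volatile pair. The NumPy toy model below is only an illustrative sketch of that idea: the class name, conductance ranges, noise level, and the update/transfer routines are assumptions made for exposition, not the paper's actual device model, parameters, or code.

```python
import numpy as np

rng = np.random.default_rng(0)

class AnalogueSynapseArray:
    """Toy model of one analogue synaptic array (illustrative assumptions only)."""

    def __init__(self, n_in, n_out, device_noise=0.02):
        # Slow, non-volatile weight pair (stands in for phase-change memory).
        self.G_plus = rng.uniform(0.0, 1.0, (n_in, n_out))
        self.G_minus = rng.uniform(0.0, 1.0, (n_in, n_out))
        # Fast, near-linear volatile component (stands in for the capacitor).
        self.C_fast = np.zeros((n_in, n_out))
        self.device_noise = device_noise  # crude stand-in for device-to-device variation

    def weights(self):
        # Effective weight = non-volatile pair plus the volatile correction.
        return (self.G_plus - self.G_minus) + self.C_fast

    def forward_mac(self, x):
        # One parallelised multiply-accumulate "at the location of the weight data":
        # every device contributes simultaneously, with multiplicative read noise.
        noisy = self.weights() * (1.0 + self.device_noise * rng.standard_normal(self.C_fast.shape))
        return x @ noisy

    def update(self, x, delta, lr=0.1):
        # Frequent backpropagation updates land on the near-linear capacitor only.
        self.C_fast += lr * np.outer(x, delta)

    def transfer(self):
        # Occasional transfer of the accumulated capacitor value into the
        # non-volatile pair. (The paper additionally inverts the polarity of
        # the pair on successive transfers to cancel device-to-device
        # asymmetries; that detail is omitted in this sketch.)
        self.G_plus += np.clip(self.C_fast, 0.0, None)
        self.G_minus += np.clip(-self.C_fast, 0.0, None)
        self.C_fast[:] = 0.0

# Tiny usage example: one forward pass, one update, then a transfer.
layer = AnalogueSynapseArray(n_in=4, n_out=3)
x = rng.standard_normal(4)
print(layer.forward_mac(x))
layer.update(x, delta=np.array([0.2, -0.1, 0.05]))
layer.transfer()
```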

Similar Articles

1
Equivalent-accuracy accelerated neural-network training using analogue memory.
Nature. 2018 Jun;558(7708):60-67. doi: 10.1038/s41586-018-0180-5. Epub 2018 Jun 6.
2
Hardware implementation of backpropagation using progressive gradient descent for in situ training of multilayer neural networks.
Sci Adv. 2024 Jul 12;10(28):eado8999. doi: 10.1126/sciadv.ado8999.
3
Cost-effective stochastic MAC circuits for deep neural networks.
Neural Netw. 2019 Sep;117:152-162. doi: 10.1016/j.neunet.2019.04.017. Epub 2019 May 20.
4
Parallel Training of Analog Neural Network Using Electrochemical Random-Access Memory.
Front Neurosci. 2021 Apr 8;15:636127. doi: 10.3389/fnins.2021.636127. eCollection 2021.
5
Bulk-Switching Memristor-Based Compute-In-Memory Module for Deep Neural Network Training.
Adv Mater. 2023 Nov;35(46):e2305465. doi: 10.1002/adma.202305465. Epub 2023 Oct 15.
6
Optimised weight programming for analogue memory-based deep neural networks.
Nat Commun. 2022 Jun 30;13(1):3765. doi: 10.1038/s41467-022-31405-1.
7
Direct Feedback Alignment With Sparse Connections for Local Learning.
Front Neurosci. 2019 May 24;13:525. doi: 10.3389/fnins.2019.00525. eCollection 2019.
8
Mixed-Precision Deep Learning Based on Computational Memory.
Front Neurosci. 2020 May 12;14:406. doi: 10.3389/fnins.2020.00406. eCollection 2020.
9
Face classification using electronic synapses.
Nat Commun. 2017 May 12;8:15199. doi: 10.1038/ncomms15199.
10
Memory-Efficient Deep Learning on a SpiNNaker 2 Prototype.
Front Neurosci. 2018 Nov 16;12:840. doi: 10.3389/fnins.2018.00840. eCollection 2018.

Cited By

1
Neuromorphic Hebbian learning with magnetic tunnel junction synapses.
Commun Eng. 2025 Aug 4;4(1):142. doi: 10.1038/s44172-025-00479-2.
2
Advanced Design for High-Performance and AI Chips.
Nanomicro Lett. 2025 Jul 29;18(1):13. doi: 10.1007/s40820-025-01850-w.
3
The Role of Phase-Change Memory in Edge Computing and Analog In-Memory Computing: An Overview of Recent Research Contributions and Future Challenges.
Sensors (Basel). 2025 Jun 9;25(12):3618. doi: 10.3390/s25123618.
4
Memristive floating-point Fourier neural operator network for efficient scientific modeling.
Sci Adv. 2025 Jun 20;11(25):eadv4446. doi: 10.1126/sciadv.adv4446.
5
Phase-Change Memory for In-Memory Computing.
Chem Rev. 2025 Jun 11;125(11):5163-5194. doi: 10.1021/acs.chemrev.4c00670. Epub 2025 May 22.
6
Resistive Switching Random-Access Memory (RRAM): Applications and Requirements for Memory and Computing.
Chem Rev. 2025 Jun 25;125(12):5584-5625. doi: 10.1021/acs.chemrev.4c00845. Epub 2025 May 2.
7
Benchmarking Stochasticity behind Reproducibility: Denoising Strategies in TaO Memristors.
ACS Appl Mater Interfaces. 2025 Apr 30;17(17):25654-25662. doi: 10.1021/acsami.5c00257. Epub 2025 Apr 19.
8
Nonvolatile Memristive Materials and Physical Modeling for In-Memory and In-Sensor Computing.
Small Sci. 2024 Jan 22;4(3):2300139. doi: 10.1002/smsc.202300139. eCollection 2024 Mar.
9
Effect of Peierls-like distortions on transport in amorphous phase change devices.
Commun Mater. 2025;6(1):56. doi: 10.1038/s43246-025-00776-5. Epub 2025 Mar 29.
10
8-bit states in 2D floating-gate memories using gate-injection mode for large-scale convolutional neural networks.
Nat Commun. 2025 Mar 18;16(1):2649. doi: 10.1038/s41467-025-58005-z.