Demirkiran Cansu, Nair Lakshmi, Bunandar Darius, Joshi Ajay
Boston University, Boston, MA, USA.
Lightmatter, Boston, MA, USA.
Nat Commun. 2024 Jun 14;15(1):5098. doi: 10.1038/s41467-024-49324-8.
Analog computing has reemerged as a promising avenue for accelerating deep neural networks (DNNs) to overcome the scalability challenges posed by traditional digital architectures. However, achieving high precision using analog technologies is challenging, as high-precision data converters are costly and impractical. In this work, we address this challenge by using the residue number system (RNS) and composing high-precision operations from multiple low-precision operations, thereby eliminating the need for high-precision data converters and avoiding information loss. Our study demonstrates that the RNS-based approach can achieve ≥99% of FP32 accuracy with 6-bit integer arithmetic for DNN inference and 7-bit integer arithmetic for DNN training. These reduced precision requirements imply that RNS-based analog hardware can achieve several orders of magnitude higher energy efficiency at the same throughput than conventional analog hardware of the same precision. We also present a fault-tolerant dataflow using redundant RNS to protect the computation against noise and errors inherent in analog hardware.
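To illustrate the idea described in the abstract, the following is a minimal Python sketch (not the authors' code) of how an RNS composes a high-dynamic-range dot product from independent low-precision modular channels, with one redundant modulus added to detect a single-channel fault, in the spirit of the redundant-RNS dataflow. The moduli values, function names, and test vectors are illustrative assumptions only.

```python
from math import prod

MODULI = (61, 63, 64)     # pairwise-coprime moduli, each fits in 6 bits (assumed values)
REDUNDANT = (67,)         # one extra modulus for redundant-RNS error detection
M_LEGIT = prod(MODULI)    # legitimate dynamic range: 61 * 63 * 64 = 245,952

def to_residues(x, moduli):
    """Encode an integer as its residues modulo each channel's modulus."""
    return tuple(x % m for m in moduli)

def crt(residues, moduli):
    """Chinese-remainder reconstruction of an integer from its residues."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # pow(Mi, -1, m): modular inverse
    return total % M

def rns_dot(a_res, b_res, moduli):
    """Dot product computed independently in each residue channel.

    Every operand and partial sum is reduced modulo the channel's modulus,
    so each channel only ever needs low-precision integer arithmetic.
    """
    out = []
    for k, m in enumerate(moduli):
        acc = 0
        for ar, br in zip(a_res, b_res):
            acc = (acc + ar[k] * br[k]) % m
        out.append(acc)
    return tuple(out)

if __name__ == "__main__":
    all_moduli = MODULI + REDUNDANT
    a = [17, 42, 5, 60]                    # small integer activations (illustrative)
    b = [3, 11, 29, 7]                     # small integer weights (illustrative)
    a_res = [to_residues(x, all_moduli) for x in a]
    b_res = [to_residues(x, all_moduli) for x in b]

    res = rns_dot(a_res, b_res, all_moduli)
    value = crt(res, all_moduli)
    assert value == sum(x * y for x, y in zip(a, b))   # exact result, no information loss

    # Redundant-RNS check: corrupt one channel's output to emulate an analog fault.
    faulty = (res[0] ^ 0b1,) + res[1:]
    bad = crt(faulty, all_moduli)
    print("fault detected:", bad >= M_LEGIT)  # reconstruction falls outside the legitimate range
```

The error check relies on a standard redundant-RNS property: with one redundant modulus, any error confined to a single residue channel pushes the reconstructed value outside the legitimate dynamic range, so it can be flagged and the computation protected.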