Reward-based learning under hardware constraints: using a RISC processor embedded in a neuromorphic substrate.

Affiliation

Kirchhoff Institute for Physics, Ruprecht-Karls-University Heidelberg, Heidelberg, Germany.

Publication information

Front Neurosci. 2013 Sep 20;7:160. doi: 10.3389/fnins.2013.00160. eCollection 2013.

Abstract

In this study, we propose and analyze in simulation a new, highly flexible method of implementing synaptic plasticity in a wafer-scale, accelerated neuromorphic hardware system. The study focuses on globally modulated spike-timing-dependent plasticity (STDP) as a special use case of this method. Flexibility is achieved by embedding a general-purpose processor dedicated to plasticity into the wafer. To evaluate the suitability of the proposed system, we use a reward-modulated STDP rule in a spike train learning task. A single layer of neurons is trained to fire at specific points in time with only the reward as feedback. This model is simulated to measure its performance, i.e., the increase in received reward after learning. Using this performance as a baseline, we then simulate the model under various constraints imposed by the proposed implementation and compare the results. The simulated constraints include discretized synaptic weights, a restricted interface between the analog synapses and the embedded processor, and mismatch of analog circuits. We find that probabilistic updates can increase the performance of low-resolution weights, that a simple interface between analog synapses and processor is sufficient for learning, and that performance is insensitive to mismatch. Further, we consider the communication latency between the wafer and the conventional control computer system that simulates the environment. This latency increases the delay with which the reward is sent to the embedded processor. Because the analog synapses operate continuously in time, this delay can cause the updates to deviate from those of the undelayed case. We find that for highly accelerated systems, latency has to be kept to a minimum. This study demonstrates the suitability of the proposed implementation for emulating the selected reward-modulated STDP learning rule. The implementation is therefore an ideal candidate for an upgraded version of the wafer-scale system developed within the BrainScaleS project.
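The finding that probabilistic updates help low-resolution weights can be illustrated with a small sketch. It is not the authors' implementation: the function names, the 4-bit default, and the use of stochastic rounding as the probabilistic scheme are all assumptions made for illustration. The idea shown is that an update smaller than one weight step is lost by deterministic rounding, but is preserved on average when the fractional part is applied with matching probability.

```python
import math
import random


def probabilistic_update(weight, delta, n_bits=4, rng=random):
    """Apply a real-valued update to an integer weight on an n-bit grid.

    Hypothetical helper: the fractional part of `delta` is turned into a
    full step with probability equal to that fraction, so the expected
    change equals `delta` as long as no clipping occurs.
    """
    w_max = 2 ** n_bits - 1
    whole = math.floor(delta)          # integer part (floor, also for negatives)
    frac = delta - whole               # remaining fraction in [0, 1)
    step = whole + (1 if rng.random() < frac else 0)
    return min(w_max, max(0, weight + step))


def deterministic_update(weight, delta, n_bits=4):
    """Baseline for comparison: round the update to the nearest weight step."""
    w_max = 2 ** n_bits - 1
    return min(w_max, max(0, weight + round(delta)))


if __name__ == "__main__":
    rng = random.Random(0)
    trials = 10_000
    mean_prob = sum(probabilistic_update(5, 0.25, rng=rng) for _ in range(trials)) / trials
    print("deterministic:", deterministic_update(5, 0.25))
    print("probabilistic mean:", mean_prob)
```

With deterministic rounding, a stream of small updates leaves the weight unchanged, whereas the probabilistic variant moves it by the intended amount on average at the cost of added noise per update.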


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2234/3778319/ae2749a37ffa/fnins-07-00160-g0001.jpg
