Multi-layer network utilizing rewarded spike time dependent plasticity to learn a foraging task.

Authors

Sanda Pavel, Skorheim Steven, Bazhenov Maxim

Affiliations

Department of Medicine, University of California, San Diego, La Jolla, California, United States of America.

Information and Systems Sciences Lab, HRL Laboratories, LLC, Malibu, California, United States of America.

Publication information

PLoS Comput Biol. 2017 Sep 29;13(9):e1005705. doi: 10.1371/journal.pcbi.1005705. eCollection 2017 Sep.

DOI:10.1371/journal.pcbi.1005705
PMID:28961245
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5636167/
Abstract

Neural networks with a single plastic layer employing reward modulated spike time dependent plasticity (STDP) are capable of learning simple foraging tasks. Here we demonstrate advanced pattern discrimination and continuous learning in a network of spiking neurons with multiple plastic layers. The network utilized both reward modulated and non-reward modulated STDP and implemented multiple mechanisms for homeostatic regulation of synaptic efficacy, including heterosynaptic plasticity, gain control, output balancing, activity normalization of rewarded STDP and hard limits on synaptic strength. We found that addition of a hidden layer of neurons employing non-rewarded STDP created neurons that responded to the specific combinations of inputs and thus performed basic classification of the input patterns. When combined with a following layer of neurons implementing rewarded STDP, the network was able to learn, despite the absence of labeled training data, discrimination between rewarding patterns and the patterns designated as punishing. Synaptic noise allowed for trial-and-error learning that helped to identify the goal-oriented strategies which were effective in task solving. The study predicts a critical set of properties of the spiking neuronal network with STDP that was sufficient to solve a complex foraging task involving pattern classification and decision making.
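The abstract names the learning ingredients but gives no equations, so the following is only a minimal sketch of how non-rewarded STDP, rewarded STDP, hard limits on synaptic strength, and a simple form of input rebalancing could fit together. The exponential pair-based kernel, all constants, and the function names (`stdp_kernel`, `unrewarded_update`, `rewarded_update`, `rebalance`) are illustrative assumptions, not the authors' model; in particular, `rebalance` is a crude stand-in for the homeostatic mechanisms the paper lists.

```python
import math

# All constants below are assumed for illustration; they are not from the paper.
A_PLUS = 0.01             # potentiation amplitude
A_MINUS = 0.012           # depression amplitude
TAU = 20.0                # STDP time constant (ms)
W_MIN, W_MAX = 0.0, 1.0   # hard limits on synaptic strength


def stdp_kernel(dt):
    """Pair-based STDP change for dt = t_post - t_pre (ms)."""
    if dt > 0:    # pre before post: potentiation
        return A_PLUS * math.exp(-dt / TAU)
    if dt < 0:    # post before pre: depression
        return -A_MINUS * math.exp(dt / TAU)
    return 0.0


def clip(w):
    """Keep a weight inside the hard limits."""
    return min(W_MAX, max(W_MIN, w))


def unrewarded_update(w, dt):
    """Non-rewarded STDP: the change is applied immediately."""
    return clip(w + stdp_kernel(dt))


def rewarded_update(w, trace, reward):
    """Rewarded STDP: a stored trace is applied only when a reward (+1)
    or punishment (-1) signal arrives; punishment reverses its sign."""
    return clip(w + reward * trace)


def rebalance(weights, target_total):
    """Rescale incoming weights so their sum returns to target_total
    (a simplified stand-in for homeostatic regulation)."""
    total = sum(weights)
    if total == 0:
        return list(weights)
    return [clip(w * target_total / total) for w in weights]


if __name__ == "__main__":
    trace = stdp_kernel(5.0)                # pre fired 5 ms before post
    print(rewarded_update(0.5, trace, +1))  # strengthened after reward
    print(rewarded_update(0.5, trace, -1))  # weakened after punishment
    print(rebalance([0.2, 0.6], 0.4))       # total input scaled back to 0.4
```

In this sketch the hidden layer would use `unrewarded_update` and the output layer `rewarded_update`, mirroring the two-stage arrangement the abstract describes; the synaptic noise that drives trial-and-error exploration is omitted.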

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/942c/5636167/378082a9fe08/pcbi.1005705.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/942c/5636167/4c3d18c454dd/pcbi.1005705.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/942c/5636167/8c6e20a3ea5b/pcbi.1005705.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/942c/5636167/f1f750b53905/pcbi.1005705.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/942c/5636167/7dcf32168547/pcbi.1005705.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/942c/5636167/edf31a93eb9f/pcbi.1005705.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/942c/5636167/2f9c56d39513/pcbi.1005705.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/942c/5636167/12ec75f7a93b/pcbi.1005705.g008.jpg
