Suppr 超能文献


Training a spiking neuronal network model of visual-motor cortex to play a virtual racket-ball game using reinforcement learning.

Affiliations

Center for Biomedical Imaging and Neuromodulation, Nathan Kline Institute for Psychiatric Research, Orangeburg, New York, United States of America.

Dept. Physiology & Pharmacology, State University of New York Downstate, Brooklyn, New York, United States of America.

Publication Information

PLoS One. 2022 May 11;17(5):e0265808. doi: 10.1371/journal.pone.0265808. eCollection 2022.

DOI:10.1371/journal.pone.0265808
PMID:35544518
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9094569/
Abstract

Recent models of spiking neuronal networks have been trained to perform behaviors in static environments using a variety of learning rules, with varying degrees of biological realism. Most of these models have not been tested in dynamic visual environments where models must make predictions on future states and adjust their behavior accordingly. The models using these learning rules are often treated as black boxes, with little analysis on circuit architectures and learning mechanisms supporting optimal performance. Here we developed visual/motor spiking neuronal network models and trained them to play a virtual racket-ball game using several reinforcement learning algorithms inspired by the dopaminergic reward system. We systematically investigated how different architectures and circuit-motifs (feed-forward, recurrent, feedback) contributed to learning and performance. We also developed a new biologically-inspired learning rule that significantly enhanced performance, while reducing training time. Our models included visual areas encoding game inputs and relaying the information to motor areas, which used this information to learn to move the racket to hit the ball. Neurons in the early visual area relayed information encoding object location and motion direction across the network. Neuronal association areas encoded spatial relationships between objects in the visual scene. Motor populations received inputs from visual and association areas representing the dorsal pathway. Two populations of motor neurons generated commands to move the racket up or down. Model-generated actions updated the environment and triggered reward or punishment signals that adjusted synaptic weights so that the models could learn which actions led to reward. Here we demonstrate that our biologically-plausible learning rules were effective in training spiking neuronal network models to solve problems in dynamic environments. 
We used our models to dissect the circuit architectures and learning rules most effective for learning. Our model shows that learning mechanisms involving different neural circuits produce similar performance in sensory-motor tasks. In biological networks, all learning mechanisms may complement one another, accelerating the learning capabilities of animals. Furthermore, this also highlights the resilience and redundancy in biological systems.
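The abstract describes reward-modulated plasticity: spike coincidences tag synapses, and a later dopamine-like reward or punishment signal converts those tags into weight changes. A minimal sketch of that family of rules, assuming an eligibility-trace formulation (this is illustrative only, not the paper's model; all sizes, names, and parameters below are invented):

```python
import numpy as np

# Illustrative reward-modulated STDP sketch: coincident pre/post spikes
# tag synapses with an eligibility trace; a delayed reward (+1) or
# punishment (-1) signal converts the trace into a weight change.

rng = np.random.default_rng(0)

n_pre, n_post = 20, 2                        # e.g. visual inputs -> "up"/"down" motor units
w = rng.uniform(0.0, 0.5, (n_post, n_pre))   # synaptic weights
elig = np.zeros_like(w)                      # eligibility traces

tau_e = 50.0   # trace decay time constant (ms)
dt = 1.0       # simulation time step (ms)
lr = 0.01      # learning rate

def step(pre_spikes, post_spikes, reward):
    """One step: decay traces, tag coincident spikes, apply reward signal."""
    global w, elig
    elig *= np.exp(-dt / tau_e)               # traces fade over time
    # tag synapses whose pre and post neurons spiked together (simplified STDP)
    elig += np.outer(post_spikes, pre_spikes)
    # dopamine-like signal gates the trace into an actual weight change
    w += lr * reward * elig
    np.clip(w, 0.0, 1.0, out=w)               # keep weights bounded

# one toy step: a few inputs spike, the "up" unit fires, the action is rewarded
pre = np.zeros(n_pre)
pre[[2, 7, 11]] = 1.0
post = np.array([1.0, 0.0])
w_before = w.copy()
step(pre, post, reward=+1.0)
# only synapses from active inputs onto the active motor unit strengthen
```

The key property, which the abstract attributes to the dopaminergic reward system, is that credit assignment is delayed: the trace lets a reward arriving after the action still reinforce the spike pairings that caused it.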


Figures (pone.0265808.g001–g009):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/3e21470b2fd8/pone.0265808.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/ea203cf233eb/pone.0265808.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/9b880fc51d11/pone.0265808.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/9abd8512e5e1/pone.0265808.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/cacea06d3037/pone.0265808.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/1f3cb080ae32/pone.0265808.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/7121c280e550/pone.0265808.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/2da8bcf0f465/pone.0265808.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db80/9094569/553af7229d84/pone.0265808.g009.jpg

Similar Articles

1. Training a spiking neuronal network model of visual-motor cortex to play a virtual racket-ball game using reinforcement learning.
   PLoS One. 2022 May 11;17(5):e0265808. doi: 10.1371/journal.pone.0265808. eCollection 2022.
2. Reinforcement learning of targeted movement in a spiking neuronal model of motor cortex.
   PLoS One. 2012;7(10):e47251. doi: 10.1371/journal.pone.0047251. Epub 2012 Oct 19.
3. A Simple Network Architecture Accounts for Diverse Reward Time Responses in Primary Visual Cortex.
   J Neurosci. 2015 Sep 16;35(37):12659-72. doi: 10.1523/JNEUROSCI.0871-15.2015.
4. Mirrored STDP Implements Autoencoder Learning in a Network of Spiking Neurons.
   PLoS Comput Biol. 2015 Dec 3;11(12):e1004566. doi: 10.1371/journal.pcbi.1004566. eCollection 2015 Dec.
5. A Scalable Weight-Free Learning Algorithm for Regulatory Control of Cell Activity in Spiking Neuronal Networks.
   Int J Neural Syst. 2018 Mar;28(2):1750015. doi: 10.1142/S0129065717500150. Epub 2016 Dec 22.
6. Reinforcement Learning of Linking and Tracing Contours in Recurrent Neural Networks.
   PLoS Comput Biol. 2015 Oct 23;11(10):e1004489. doi: 10.1371/journal.pcbi.1004489. eCollection 2015 Oct.
7. Learning spatiotemporal signals using a recurrent spiking network that discretizes time.
   PLoS Comput Biol. 2020 Jan 21;16(1):e1007606. doi: 10.1371/journal.pcbi.1007606. eCollection 2020 Jan.
8. Reinforcement learning through modulation of spike-timing-dependent synaptic plasticity.
   Neural Comput. 2007 Jun;19(6):1468-502. doi: 10.1162/neco.2007.19.6.1468.
9. An unsupervised STDP-based spiking neural network inspired by biologically plausible learning rules and connections.
   Neural Netw. 2023 Aug;165:799-808. doi: 10.1016/j.neunet.2023.06.019. Epub 2023 Jun 22.
10. Extreme neural machines.
   Neural Netw. 2021 Dec;144:639-647. doi: 10.1016/j.neunet.2021.09.021. Epub 2021 Oct 1.

Cited By

1. A spiking neural network for active efficient coding.
   Front Robot AI. 2025 Jan 15;11:1435197. doi: 10.3389/frobt.2024.1435197. eCollection 2024.
2. Incorporating structural plasticity into self-organization recurrent networks for sequence learning.
   Front Neurosci. 2023 Aug 1;17:1224752. doi: 10.3389/fnins.2023.1224752. eCollection 2023.
3. Training spiking neuronal networks to perform motor control using reinforcement and evolutionary learning.
   Front Comput Neurosci. 2022 Sep 30;16:1017284. doi: 10.3389/fncom.2022.1017284. eCollection 2022.
4. Modernizing the NEURON Simulator for Sustainability, Portability, and Performance.
   Front Neuroinform. 2022 Jun 27;16:884046. doi: 10.3389/fninf.2022.884046. eCollection 2022.

References

1. Sleep prevents catastrophic forgetting in spiking neural networks by forming a joint synaptic weight representation.
   PLoS Comput Biol. 2022 Nov 18;18(11):e1010628. doi: 10.1371/journal.pcbi.1010628. eCollection 2022 Nov.
2. The mouse cortico-basal ganglia-thalamic network.
   Nature. 2021 Oct;598(7879):188-194. doi: 10.1038/s41586-021-03993-3. Epub 2021 Oct 6.
3. A dopamine gradient controls access to distributed working memory in the large-scale monkey cortex.
   Neuron. 2021 Nov 3;109(21):3500-3520.e13. doi: 10.1016/j.neuron.2021.08.024. Epub 2021 Sep 17.
4. Optimal plasticity for memory maintenance during ongoing synaptic change.
   Elife. 2021 Sep 14;10:e62912. doi: 10.7554/eLife.62912.
5. Replay in Deep Learning: Current Approaches and Missing Biological Elements.
   Neural Comput. 2021 Oct 12;33(11):2908-2950. doi: 10.1162/neco_a_01433.
6. Homeostatic synaptic scaling establishes the specificity of an associative memory.
   Curr Biol. 2021 Jun 7;31(11):2274-2285.e5. doi: 10.1016/j.cub.2021.03.024. Epub 2021 Apr 1.
7. First return, then explore.
   Nature. 2021 Feb;590(7847):580-586. doi: 10.1038/s41586-020-03157-9. Epub 2021 Feb 24.
8. Frozen algorithms: how the brain's wiring facilitates learning.
   Curr Opin Neurobiol. 2021 Apr;67:207-214. doi: 10.1016/j.conb.2020.12.017. Epub 2021 Jan 25.
9. Prioritized experience replays on a hippocampal predictive map for learning.
   Proc Natl Acad Sci U S A. 2021 Jan 5;118(1). doi: 10.1073/pnas.2011266118. Epub 2020 Dec 18.
10. PsychRNN: An Accessible and Flexible Python Package for Training Recurrent Neural Network Models on Cognitive Tasks.
   eNeuro. 2021 Jan 15;8(1). doi: 10.1523/ENEURO.0427-20.2020. Print 2021 Jan-Feb.