Democratic population decisions result in robust policy-gradient learning: a parametric study with GPU simulations.

Affiliation

Department of Computer Science, University of Sheffield, Sheffield, United Kingdom.

Publication information

PLoS One. 2011 May 4;6(5):e18539. doi: 10.1371/journal.pone.0018539.

DOI:10.1371/journal.pone.0018539

PMID:21572529

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3087717/

Abstract

High performance computing on the Graphics Processing Unit (GPU) is an emerging field driven by the promise of high computational power at a low cost. However, GPU programming is a non-trivial task, and architectural limitations raise the question of whether investing effort in this direction is worthwhile. In this work, we use GPU programming to simulate a two-layer network of Integrate-and-Fire neurons with varying degrees of recurrent connectivity and investigate its ability to learn a simplified navigation task using a policy-gradient learning rule stemming from Reinforcement Learning. The purpose of this paper is twofold. First, we want to support the use of GPUs in the field of Computational Neuroscience. Second, using GPU computing power, we investigate the conditions under which this architecture and learning rule perform best. Our work indicates that networks featuring strong Mexican-Hat-shaped recurrent connections in the top layer, where decision making is governed by the formation of a stable activity bump in the neural population (a "non-democratic" mechanism), achieve mediocre learning results at best. In the absence of recurrent connections, where all neurons "vote" independently ("democratic") for a decision via population vector readout, the task is generally learned better and more robustly. Our study would have been extremely difficult on a desktop computer without the use of GPU programming. We present the routines developed for this purpose and show that they provide a speed improvement of 5x to 42x over optimised Python code. The higher speed-up is achieved when we exploit the parallelism of the GPU in the search of learning parameters. This suggests that efficient GPU programming can significantly reduce the time needed for simulating networks of spiking neurons, particularly when multiple parameter configurations are investigated.
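The "democratic" decision mechanism mentioned in the abstract, a population vector readout, can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `population_vector`, the assumption that preferred directions are evenly spaced around a circle, and the use of raw spike counts as votes are all illustrative choices made here.

```python
import math


def population_vector(spike_counts):
    """Decode a direction (radians in [0, 2*pi)) from per-neuron spike counts.

    Assumption for illustration: neuron i has preferred direction
    2*pi*i/n, i.e. the n neurons tile the circle evenly.
    Returns None when the votes cancel out and no direction is defined.
    """
    n = len(spike_counts)
    x = 0.0
    y = 0.0
    for i, count in enumerate(spike_counts):
        preferred = 2.0 * math.pi * i / n
        # Each neuron votes for its preferred direction,
        # weighted by how many spikes it fired.
        x += count * math.cos(preferred)
        y += count * math.sin(preferred)
    if math.hypot(x, y) < 1e-9:
        # All-zero or perfectly balanced activity: no decision.
        return None
    return math.atan2(y, x) % (2.0 * math.pi)


if __name__ == "__main__":
    # Four hypothetical neurons; only the one preferring pi/2 fires.
    print(population_vector([0, 5, 0, 0]))
```

With four neurons and activity only in the second one, the decoded direction is approximately π/2. Every neuron contributes independently to the sum, which is what distinguishes this readout from a decision driven by a single recurrently stabilised activity bump.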

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/06e6/3087717/81b2c0ba11b3/pone.0018539.g001.jpg

Similar articles

Democratic population decisions result in robust policy-gradient learning: a parametric study with GPU simulations.

PLoS One. 2011 May 4;6(5):e18539. doi: 10.1371/journal.pone.0018539.

Streaming parallel GPU acceleration of large-scale filter-based spiking neural networks.

Network. 2012;23(4):183-211. doi: 10.3109/0954898X.2012.733842. Epub 2012 Oct 25.

PyGeNN: A Python Library for GPU-Enhanced Neural Networks.

Front Neuroinform. 2021 Apr 22;15:659005. doi: 10.3389/fninf.2021.659005. eCollection 2021.

NMF-mGPU: non-negative matrix factorization on multi-GPU systems.

BMC Bioinformatics. 2015 Feb 13;16:43. doi: 10.1186/s12859-015-0485-4.

A configurable simulation environment for the efficient simulation of large-scale spiking neural networks on graphics processors.

Neural Netw. 2009 Jul-Aug;22(5-6):791-800. doi: 10.1016/j.neunet.2009.06.028. Epub 2009 Jul 2.

LASSIE: simulating large-scale models of biochemical systems on GPUs.

BMC Bioinformatics. 2017 May 10;18(1):246. doi: 10.1186/s12859-017-1666-0.

Parallel beamlet dose calculation via beamlet contexts in a distributed multi-GPU framework.

Med Phys. 2019 Aug;46(8):3719-3733. doi: 10.1002/mp.13651. Epub 2019 Jun 30.

A nonvoxel-based dose convolution/superposition algorithm optimized for scalable GPU architectures.

Med Phys. 2014 Oct;41(10):101711. doi: 10.1118/1.4895822.

Integrative multicellular biological modeling: a case study of 3D epidermal development using GPU algorithms.

BMC Syst Biol. 2010 Aug 9;4:107. doi: 10.1186/1752-0509-4-107.

An evaluation of multiple feed-forward networks on GPUs.

Int J Neural Syst. 2011 Feb;21(1):31-47. doi: 10.1142/S0129065711002638.

Cited by

Abstract concept learning in a simple neural network inspired by the insect brain.

PLoS Comput Biol. 2018 Sep 17;14(9):e1006435. doi: 10.1371/journal.pcbi.1006435. eCollection 2018 Sep.

A computational model of the integration of landmarks and motion in the insect central complex.

PLoS One. 2017 Feb 27;12(2):e0172325. doi: 10.1371/journal.pone.0172325. eCollection 2017.

Detailed classification of swimming paths in the Morris Water Maze: multiple strategies within one trial.

Sci Rep. 2015 Oct 1;5:14562. doi: 10.1038/srep14562.

Decision-making and action selection in insects: inspiration from vertebrate-based theories.

Front Behav Neurosci. 2015 Aug 18;9:216. doi: 10.3389/fnbeh.2015.00216. eCollection 2015.

Adaptation of short-term plasticity parameters via error-driven learning may explain the correlation between activity-dependent synaptic properties, connectivity motifs and target specificity.

Front Comput Neurosci. 2015 Jan 29;8:175. doi: 10.3389/fncom.2014.00175. eCollection 2014.

Measuring symmetry, asymmetry and randomness in neural network connectivity.

PLoS One. 2014 Jul 9;9(7):e100805. doi: 10.1371/journal.pone.0100805. eCollection 2014.

Emergence of connectivity motifs in networks of model neurons with short- and long-term plastic synapses.

PLoS One. 2014 Jan 15;9(1):e84626. doi: 10.1371/journal.pone.0084626. eCollection 2014.

References

Compensating Inhomogeneities of Neuromorphic VLSI Devices Via Short-Term Synaptic Plasticity.

Front Comput Neurosci. 2010 Oct 8;4:129. doi: 10.3389/fncom.2010.00129. eCollection 2010.

Code generation: a strategy for neural network simulators.

Neuroinformatics. 2010 Oct;8(3):183-96. doi: 10.1007/s12021-010-9082-x.

Pavlovian-instrumental interaction in 'observing behavior'.

PLoS Comput Biol. 2010 Sep 9;6(9):e1000903. doi: 10.1371/journal.pcbi.1000903.

Reinforcement learning on slow features of high-dimensional input streams.

PLoS Comput Biol. 2010 Aug 19;6(8):e1000894. doi: 10.1371/journal.pcbi.1000894.

Learning spike-based population codes by reward and population feedback.

Neural Comput. 2010 Jul;22(7):1698-717. doi: 10.1162/neco.2010.05-09-1010.

Automatic fitting of spiking neuron models to electrophysiological recordings.

Front Neuroinform. 2010 Mar 5;4:2. doi: 10.3389/neuro.11.002.2010. eCollection 2010.

Trends in programming languages for neuroscience simulations.

Front Neurosci. 2009 Dec 15;3(3):374-80. doi: 10.3389/neuro.01.036.2009. eCollection 2009.

High performance cellular level agent-based simulation with FLAME for the GPU.

Brief Bioinform. 2010 May;11(3):334-47. doi: 10.1093/bib/bbp073. Epub 2010 Feb 1.

The Brian simulator.

Front Neurosci. 2009 Sep 15;3(2):192-7. doi: 10.3389/neuro.01.026.2009. eCollection 2009 Sep.

Spike-based reinforcement learning in continuous state and action space: when policy gradient methods fail.

PLoS Comput Biol. 2009 Dec;5(12):e1000586. doi: 10.1371/journal.pcbi.1000586. Epub 2009 Dec 4.
