Learning by statistical cooperation of self-interested neuron-like computing elements.

Author information

Barto A G

Publication information

Hum Neurobiol. 1985;4(4):229-56.

Abstract

Since the usual approaches to cooperative computation in networks of neuron-like computing elements do not assume that network components have any "preferences", they do not make substantive contact with game theoretic concepts, despite their use of some of the same terminology. In the approach presented here, however, each network component, or adaptive element, is a self-interested agent that prefers some inputs over others and "works" toward obtaining the most highly preferred inputs. Here we describe an adaptive element that is robust enough to learn to cooperate with other elements like itself in order to further its self-interests. It is argued that some of the longstanding problems concerning adaptation and learning by networks might be solvable by this form of cooperativity, and computer simulation experiments are described that show how networks of self-interested components that are sufficiently robust can solve rather difficult learning problems. We then place the approach in its proper historical and theoretical perspective through comparison with a number of related algorithms. A secondary aim of this article is to suggest that beyond what is explicitly illustrated here, there is a wealth of ideas from game theory and allied disciplines such as mathematical economics that can be of use in thinking about cooperative computation in both nervous systems and man-made systems.

Similar articles

Closed-form expressions of some stochastic adapting equations for nonlinear adaptive activation function neurons.

Neural Comput. 2003 Dec;15(12):2909-29. doi: 10.1162/089976603322518795.

A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback.

PLoS Comput Biol. 2008 Oct;4(10):e1000180. doi: 10.1371/journal.pcbi.1000180. Epub 2008 Oct 10.

[Mechanisms of development of long-periodicity oscillations in activity in nerve nets. Stochastically uniform nerve nets].

Neirofiziologiia. 1986;18(3):382-91.

Modeling the neural substrates of associative learning and memory: a computational approach.

Psychol Rev. 1987 Apr;94(2):176-91.

Human learning and memory: connections and dissociations.

Annu Rev Psychol. 1990;41:109-39. doi: 10.1146/annurev.ps.41.020190.000545.

Learning spike-based population codes by reward and population feedback.

Neural Comput. 2010 Jul;22(7):1698-717. doi: 10.1162/neco.2010.05-09-1010.

Meta-learning approach to neural network optimization.

Neural Netw. 2010 May;23(4):568-82. doi: 10.1016/j.neunet.2010.02.003. Epub 2010 Feb 20.

Neural networks: learning from a computer cat.

Nature. 1988 Feb 25;331(6158):657-9. doi: 10.1038/331657a0.

Self-control with spiking and non-spiking neural networks playing games.

J Physiol Paris. 2010 May-Sep;104(3-4):108-17. doi: 10.1016/j.jphysparis.2009.11.013. Epub 2009 Nov 26.

Cited by

ANLN and KDR Are Jointly Prognostic of Breast Cancer Survival and Can Be Modulated for Triple Negative Breast Cancer Control.

Front Genet. 2019 Oct 4;10:790. doi: 10.3389/fgene.2019.00790. eCollection 2019.

Eligibility Traces and Plasticity on Behavioral Time Scales: Experimental Support of NeoHebbian Three-Factor Learning Rules.

Front Neural Circuits. 2018 Jul 31;12:53. doi: 10.3389/fncir.2018.00053. eCollection 2018.

Reinforcement Learning of Linking and Tracing Contours in Recurrent Neural Networks.

PLoS Comput Biol. 2015 Oct 23;11(10):e1004489. doi: 10.1371/journal.pcbi.1004489. eCollection 2015 Oct.

Emergent structured transition from variation to repetition in a biologically-plausible model of learning in basal ganglia.

Front Psychol. 2014 Feb 11;5:91. doi: 10.3389/fpsyg.2014.00091. eCollection 2014.

Democratic population decisions result in robust policy-gradient learning: a parametric study with GPU simulations.

PLoS One. 2011 May 4;6(5):e18539. doi: 10.1371/journal.pone.0018539.

Spike-based reinforcement learning in continuous state and action space: when policy gradient methods fail.

PLoS Comput Biol. 2009 Dec;5(12):e1000586. doi: 10.1371/journal.pcbi.1000586. Epub 2009 Dec 4.

Neural networks for perceptual processing: from simulation tools to theories.

Philos Trans R Soc Lond B Biol Sci. 2007 Mar 29;362(1479):339-53. doi: 10.1098/rstb.2006.1962.

Connectionist models of conditioning: A tutorial.

J Exp Anal Behav. 1989 Nov;52(3):427-40. doi: 10.1901/jeab.1989.52-427.

Connectionistic models of Boolean category representation.

Biol Cybern. 1986;54(6):393-406. doi: 10.1007/BF00355545.

Disjunctive models of Boolean category learning.

Biol Cybern. 1987;56(2-3):121-37. doi: 10.1007/BF00317987.
