Self-control with spiking and non-spiking neural networks playing games.

Authors

Christodoulou Chris, Banfield Gaye, Cleanthous Aristodemos

Affiliation

Department of Computer Science, University of Cyprus, 75 Kallipoleos Avenue, Nicosia, Cyprus.

Publication information

J Physiol Paris. 2010 May-Sep;104(3-4):108-17. doi: 10.1016/j.jphysparis.2009.11.013. Epub 2009 Nov 26.

Abstract

Self-control can be defined as choosing a large delayed reward over a small immediate reward, while precommitment is the making of a choice with the specific aim of denying oneself future choices. Humans recognise that they have self-control problems and attempt to overcome them by applying precommitment. Problems in exercising self-control suggest a conflict between cognition and motivation, which has been linked to competition between higher and lower brain functions (representing the frontal lobes and the limbic system respectively). This premise of an internal process conflict led to the proposal of a behavioural model, on the basis of which we implemented a computational model for studying and explaining self-control through precommitment behaviour. Our model consists of two neural networks, initially non-spiking and then spiking ones, representing the higher and lower brain systems viewed as cooperating for the benefit of the organism. The non-spiking neural networks are of the simple feedforward multilayer type with reinforcement learning: one uses the selective bootstrap weight update rule, which is seen as myopic and represents the lower brain, and the other uses the temporal difference weight update rule, which is seen as far-sighted and represents the higher brain. The spiking neural networks are implemented with leaky integrate-and-fire neurons with learning based on stochastic synaptic transmission. The differentiating element between the two brain centres in this implementation is the memory of past actions, determined by an eligibility trace time constant. As the structure of the self-control problem can be likened to the Iterated Prisoner's Dilemma (IPD) game, in that cooperation is to defection what self-control is to impulsiveness or what compromising is to insisting, we implemented the neural networks as two players, learning simultaneously but independently, competing in the IPD game. With a technique resembling the precommitment effect, whereby the payoffs for the dilemma cases in the IPD payoff matrix are differentially biased (increased or decreased), it is shown that increasing the precommitment effect (through increasing the differential bias) increases the probability of cooperating with oneself in the future, irrespective of whether the implementation is with spiking or non-spiking neural networks.
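
To make the idea of a differentially biased IPD payoff matrix concrete, the following is a minimal sketch and not the authors' implementation. The abstract gives no numeric payoffs and does not say exactly which entries are biased, so everything here is an assumption: the textbook payoff values T=5, R=3, P=1, S=0, the hypothetical function names `biased_payoffs` and `is_prisoners_dilemma`, and the choice to treat the two outcomes in which the players choose differently as the "dilemma cases", lowering the temptation payoff and raising the sucker's payoff by the same bias.

```python
from typing import Dict, Tuple

# Textbook IPD payoffs (assumed; the abstract gives no numbers):
# T = temptation, R = reward, P = punishment, S = sucker's payoff.
T, R, P, S = 5.0, 3.0, 1.0, 0.0

Action = str  # "C" (cooperate) or "D" (defect)
Payoffs = Dict[Tuple[Action, Action], Tuple[float, float]]


def biased_payoffs(bias: float) -> Payoffs:
    """Return {(a1, a2): (payoff1, payoff2)} under a hypothetical bias.

    Assumption: the bias is subtracted from the temptation payoff and
    added to the sucker's payoff, so defecting against a cooperator
    becomes less attractive as the bias grows.
    """
    t = T - bias
    s = S + bias
    return {
        ("C", "C"): (R, R),
        ("C", "D"): (s, t),
        ("D", "C"): (t, s),
        ("D", "D"): (P, P),
    }


def is_prisoners_dilemma(payoffs: Payoffs) -> bool:
    """Check the usual IPD conditions: T > R > P > S and 2R > T + S."""
    t = payoffs[("D", "C")][0]
    r = payoffs[("C", "C")][0]
    p = payoffs[("D", "D")][0]
    s = payoffs[("C", "D")][0]
    return t > r > p > s and 2 * r > t + s


if __name__ == "__main__":
    for bias in (0.0, 0.5, 2.0):
        m = biased_payoffs(bias)
        print(bias, m[("D", "C")], is_prisoners_dilemma(m))
```

Under these assumed values the game stays a Prisoner's Dilemma only for biases below 1; a larger bias removes the dilemma altogether, which is one way to read a stronger precommitment effect making cooperation more likely.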
