Self-control with spiking and non-spiking neural networks playing games.

Authors

Christodoulou Chris, Banfield Gaye, Cleanthous Aristodemos

Affiliation

Department of Computer Science, University of Cyprus, 75 Kallipoleos Avenue, Nicosia, Cyprus.

Publication

J Physiol Paris. 2010 May-Sep;104(3-4):108-17. doi: 10.1016/j.jphysparis.2009.11.013. Epub 2009 Nov 26.

DOI: 10.1016/j.jphysparis.2009.11.013
PMID: 19944157

Abstract

Self-control can be defined as choosing a large delayed reward over a small immediate reward, while precommitment is the making of a choice with the specific aim of denying oneself future choices. Humans recognise that they have self-control problems and attempt to overcome them by applying precommitment. Problems in exercising self-control suggest a conflict between cognition and motivation, which has been linked to competition between higher and lower brain functions (representing the frontal lobes and the limbic system respectively). This premise of an internal process conflict led to a proposed behavioural model, on the basis of which we implemented a computational model for studying and explaining self-control through precommitment behaviour. Our model consists of two neural networks, initially non-spiking and then spiking ones, representing the higher and lower brain systems viewed as cooperating for the benefit of the organism. The non-spiking neural networks are of a simple feedforward multilayer type with reinforcement learning: one uses the selective bootstrap weight update rule, which is seen as myopic and represents the lower brain, and the other uses the temporal difference weight update rule, which is seen as far-sighted and represents the higher brain. The spiking neural networks are implemented with leaky integrate-and-fire neurons with learning based on stochastic synaptic transmission. In this implementation, the differentiating element between the two brain centres is the memory of past actions, determined by an eligibility trace time constant. As the structure of the self-control problem can be likened to the Iterated Prisoner's Dilemma (IPD) game, in that cooperation is to defection what self-control is to impulsiveness or what compromising is to insisting, we implemented the neural networks as two players, learning simultaneously but independently, competing in the IPD game. With a technique resembling the precommitment effect, whereby the payoffs for the dilemma cases in the IPD payoff matrix are differentially biased (increased or decreased), it is shown that increasing the precommitment effect (through increasing the differential bias) increases the probability of cooperating with oneself in the future, irrespective of whether the implementation is with spiking or non-spiking neural networks.
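The differential bias on the IPD payoff matrix can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no payoff values and does not state exactly which entries are biased or by how much. The sketch assumes the conventional payoffs T=5, R=3, P=1, S=0, treats the mixed outcomes (one player cooperates, the other defects) as the "dilemma cases", and applies a hypothetical `bias` that lowers the temptation payoff and raises the sucker payoff by the same amount. The names `biased_payoffs` and `defection_gain` are illustrative only.

```python
from typing import Dict, Tuple

# Conventional IPD payoffs (an assumption; the abstract gives no values).
T, R, P, S = 5.0, 3.0, 1.0, 0.0

Payoffs = Dict[Tuple[str, str], Tuple[float, float]]


def biased_payoffs(bias: float) -> Payoffs:
    """Return (row payoff, column payoff) keyed by (row action, column action).

    Assumption: the bias is applied only to the mixed outcomes, decreasing
    the defector's payoff and increasing the cooperator's payoff equally.
    """
    t, s = T - bias, S + bias
    return {
        ("C", "C"): (R, R),
        ("C", "D"): (s, t),
        ("D", "C"): (t, s),
        ("D", "D"): (P, P),
    }


def defection_gain(bias: float) -> Tuple[float, float]:
    """Row player's one-shot gain from defecting instead of cooperating,
    first against a cooperating opponent, then against a defecting one."""
    p = biased_payoffs(bias)
    gain_vs_cooperator = p[("D", "C")][0] - p[("C", "C")][0]
    gain_vs_defector = p[("D", "D")][0] - p[("C", "D")][0]
    return gain_vs_cooperator, gain_vs_defector


if __name__ == "__main__":
    for bias in (0.0, 1.0, 2.0):
        print(bias, defection_gain(bias))
```

Under these assumptions, each unit of bias reduces the one-shot incentive to defect by one unit against either kind of opponent, which is one way to read the claim that a larger differential bias makes cooperation more likely. How learning agents actually respond to such a matrix is what the paper studies and is not reproduced here.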

