Simulation of rat behavior by a reinforcement learning algorithm in consideration of appearance probabilities of reinforcement signals.

Author information

Murakoshi Kazushi, Noguchi Takuya

Affiliation

Department of Knowledge-based Information Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka, Tenpaku-cho, Toyohashi 441-8580, Japan.

Publication information

Biosystems. 2005 Apr;80(1):83-90. doi: 10.1016/j.biosystems.2004.10.005. Epub 2004 Dec 8.

DOI:10.1016/j.biosystems.2004.10.005

PMID:15740837

Abstract

Brown and Wagner [Brown, R.T., Wagner, A.R., 1964. Resistance to punishment and extinction following training with shock or nonreinforcement. J. Exp. Psychol. 68, 503-507] investigated rat behavior in an experiment with the following features: (1) rats were exposed to reward and punishment at the same time, (2) the environment changed and the rats relearned, and (3) rats were exposed to reward and punishment stochastically. They found that exposure to nonreinforcement produces resistance to the decremental effects on behavior after a stochastic reward schedule, and that exposure to both punishment and reinforcement produces resistance to the decremental effects on behavior after a stochastic punishment schedule. This paper aims to simulate these rat behaviors with a reinforcement learning algorithm that takes the appearance probabilities of reinforcement signals into account. Earlier reinforcement learning algorithms were unable to simulate the behavior associated with feature (3). We improve on those algorithms by controlling the learning parameters according to the acquisition probabilities of reinforcement signals. The proposed algorithm qualitatively reproduces the results of Brown and Wagner's animal experiment.
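The abstract does not give the update rule, so the following is only a minimal sketch of the general idea of tying a learning parameter to how often a reinforcement signal has appeared. It is not the authors' algorithm. The function name `retained_after_extinction`, the running estimate `p_hat`, and the rule "extinction learning rate = base rate × estimated reward probability" are all assumptions made for illustration.

```python
import random


def retained_after_extinction(p_reward, n_train=200, n_extinct=30,
                              base_lr=0.2, seed=0):
    """Fraction of the trained response value that survives extinction.

    Hypothetical model, not taken from the paper: a single response value
    is trained with rewards that appear with probability p_reward, then
    extinguished with no reward at all.
    """
    rng = random.Random(seed)
    value = 0.0   # learned value of the response
    p_hat = 0.0   # running estimate of how often reward has appeared

    # Training phase: reward appears stochastically.
    for t in range(1, n_train + 1):
        reward = 1.0 if rng.random() < p_reward else 0.0
        p_hat += (reward - p_hat) / t           # incremental mean
        value += base_lr * (reward - value)     # standard delta-rule update
    trained = value

    # Extinction phase: reward never appears.
    # Assumed rule: the learning rate is scaled by p_hat, so an agent that
    # rarely saw reward during training unlearns more slowly.
    lr = base_lr * p_hat
    for _ in range(n_extinct):
        value += lr * (0.0 - value)

    return value / trained


if __name__ == "__main__":
    for p in (1.0, 0.5):
        print(p, retained_after_extinction(p))
```

Under these assumptions, an agent trained on a stochastic schedule (`p_reward=0.5`) keeps a larger share of its response value through extinction than one trained with reward on every trial, which is the qualitative direction of the resistance effect described above. A fixed learning rate would give both agents the same proportional decay.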

Similar articles

Extinction after regular and irregular reward schedules in the infant rat: influence of age and training duration.

Dev Psychobiol. 1999 Jan;34(1):57-70.

The partial reinforcement extinction effect (PREE) in female Roman high- (RHA-I) and low-avoidance (RLA-I) rats.

Behav Brain Res. 2008 Dec 12;194(2):187-92. doi: 10.1016/j.bbr.2008.07.009. Epub 2008 Jul 18.

More rapid associative change with retraining than with initial training.

J Exp Psychol Anim Behav Process. 2003 Oct;29(4):251-60. doi: 10.1037/0097-7403.29.4.251.

[Mathematical models of decision making and learning].

Brain Nerve. 2008 Jul;60(7):791-8.

The neurotensin receptor agonist NT69L suppresses sucrose-reinforced operant behavior in the rat.

Brain Res. 2007 Jan 5;1127(1):90-8. doi: 10.1016/j.brainres.2006.10.025. Epub 2006 Nov 17.

The effects of acquisition training schedule on extinction and reinstatement of cocaine self-administration in male rats.

Exp Clin Psychopharmacol. 2006 May;14(2):245-53. doi: 10.1037/1064-1297.14.2.245.

The role of different subregions of the basolateral amygdala in cue-induced reinstatement and extinction of food-seeking behavior.

Neuroscience. 2007 Jun 8;146(4):1484-94. doi: 10.1016/j.neuroscience.2007.03.025. Epub 2007 Apr 20.

Biological implementation of the temporal difference algorithm for reinforcement learning: theoretical comment on O'Reilly et al. (2007).

Behav Neurosci. 2007 Feb;121(1):231-2. doi: 10.1037/0735-7044.121.1.231.

Reinforcement learning for a stochastic automaton modelling predation in stationary model-mimic environments.

Math Biosci. 2005 May;195(1):76-91. doi: 10.1016/j.mbs.2005.01.003.
