Dynamic response-by-response models of matching behavior in rhesus monkeys.

Author information

Lau Brian, Glimcher Paul W

Affiliation

Center for Neural Science, New York University, New York, New York 10003, USA.

Publication information

J Exp Anal Behav. 2005 Nov;84(3):555-79. doi: 10.1901/jeab.2005.110-04.

DOI:10.1901/jeab.2005.110-04

PMID:16596980

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1389781/

Abstract

We studied the choice behavior of 2 monkeys in a discrete-trial task with reinforcement contingencies similar to those Herrnstein (1961) used when he described the matching law. In each session, the monkeys experienced blocks of discrete trials at different relative-reinforcer frequencies or magnitudes with unsignalled transitions between the blocks. Steady-state data following adjustment to each transition were well characterized by the generalized matching law; response ratios undermatched reinforcer frequency ratios but matched reinforcer magnitude ratios. We modelled response-by-response behavior with linear models that used past reinforcers as well as past choices to predict the monkeys' choices on each trial. We found that more recently obtained reinforcers more strongly influenced choice behavior. Perhaps surprisingly, we also found that the monkeys' actions were influenced by the pattern of their own past choices. It was necessary to incorporate both past reinforcers and past choices in order to accurately capture steady-state behavior as well as the fluctuations during block transitions and the response-by-response patterns of behavior. Our results suggest that simple reinforcement learning models must account for the effects of past choices to accurately characterize behavior in this task, and that models with these properties provide a conceptual tool for studying how both past reinforcers and past choices are integrated by the neural systems that generate behavior.
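The two modelling ideas named in the abstract can be illustrated with a minimal sketch. The first function fits the generalized matching law in its usual log-linear form, log(B1/B2) = a * log(R1/R2) + log(c), where a sensitivity a below 1 corresponds to undermatching. The second combines lagged reinforcers and lagged choices linearly into a choice probability. The abstract does not give the paper's parameterization, so the +1/-1 coding, the logistic link, the function names, and all weights and data below are assumptions made for illustration, not the authors' method.

```python
import math


def generalized_matching_fit(response_ratios, reinforcer_ratios):
    """Least-squares fit of log(B1/B2) = a * log(R1/R2) + log(c).

    Returns (a, c): sensitivity and bias of the generalized matching law.
    """
    xs = [math.log(r) for r in reinforcer_ratios]
    ys = [math.log(b) for b in response_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sensitivity = sxy / sxx
    log_bias = mean_y - sensitivity * mean_x
    return sensitivity, math.exp(log_bias)


def choice_probability(past_reinforcers, past_choices,
                       reinforcer_weights, choice_weights, bias=0.0):
    """Probability of choosing option 1 on the next trial.

    Histories are ordered most recent trial first (assumed coding):
      past_reinforcers[k]: +1 reinforcer from option 1, -1 from option 2, 0 none
      past_choices[k]:     +1 option 1 chosen, -1 option 2 chosen
    Lags beyond the shorter of history and weight list are ignored.
    """
    log_odds = bias
    log_odds += sum(w * r for w, r in zip(reinforcer_weights, past_reinforcers))
    log_odds += sum(w * c for w, c in zip(choice_weights, past_choices))
    # Logistic link (an assumption) maps the linear sum to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Synthetic steady-state data with undermatching (exponent 0.8), not real data.
reinforcer_ratios = [0.25, 0.5, 1.0, 2.0, 4.0]
response_ratios = [r ** 0.8 for r in reinforcer_ratios]
a, c = generalized_matching_fit(response_ratios, reinforcer_ratios)

# Hypothetical weights: recent reinforcers count more than older ones.
p_next = choice_probability(
    past_reinforcers=[1, 0], past_choices=[1, -1],
    reinforcer_weights=[0.5, 0.2], choice_weights=[0.1, -0.1],
)
```

With reinforcer weights that decay over lags, more recent reinforcers influence the predicted choice more strongly, and the separate choice weights let the pattern of past choices shift the prediction independently of reinforcement, which are the two effects the abstract reports.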

Similar articles

Linear-Nonlinear-Poisson models of primate choice dynamics.

J Exp Anal Behav. 2005 Nov;84(3):581-617. doi: 10.1901/jeab.2005.23-05.

The generalized matching law as a predictor of choice between cocaine and food in rhesus monkeys.

Psychopharmacology (Berl). 2002 Oct;163(3-4):319-26. doi: 10.1007/s00213-002-1012-7. Epub 2002 Mar 1.

Remembering as discrimination in delayed matching to sample: discriminability and bias.

Learn Behav. 2007 Aug;35(3):177-83. doi: 10.3758/bf03193053.

Choice behavior in transition: development of preference for the higher probability of reinforcement.

J Exp Anal Behav. 1990 May;53(3):409-22. doi: 10.1901/jeab.1990.53-409.

Choice with probabilistic reinforcement: effects of delay and conditioned reinforcers.

J Exp Anal Behav. 1991 Jan;55(1):63-77. doi: 10.1901/jeab.1991.55-63.

Gestational exposure to methylmercury retards choice in transition in aging rats.

Neurotoxicol Teratol. 2004 Mar-Apr;26(2):179-94. doi: 10.1016/j.ntt.2003.12.004.

Effects of morphine sulfate on operant behavior in rhesus monkeys.

Pharmacol Biochem Behav. 1991 Jan;38(1):77-83. doi: 10.1016/0091-3057(91)90592-p.

Dynamical concurrent schedules.

J Exp Anal Behav. 2003 Jan;79(1):1-20. doi: 10.1901/jeab.2003.79-1.

Memory without awareness: pigeons do not show metamemory in delayed matching to sample.

J Exp Psychol Anim Behav Process. 2008 Apr;34(2):266-82. doi: 10.1037/0097-7403.34.2.266.

Cited by

Brain-wide representations of prior information in mouse decision-making.

Nature. 2025 Sep;645(8079):192-200. doi: 10.1038/s41586-025-09226-1. Epub 2025 Sep 3.

A multidimensional distributional map of future reward in dopamine neurons.

Nature. 2025 Jun;642(8068):691-699. doi: 10.1038/s41586-025-09089-6. Epub 2025 Jun 4.

Stimulus uncertainty and relative reward rates determine adaptive responding in perceptual decision-making.

PLoS Comput Biol. 2025 May 27;21(5):e1012636. doi: 10.1371/journal.pcbi.1012636. eCollection 2025 May.

Striatal arbitration between choice strategies guides few-shot adaptation.

Nat Commun. 2025 Feb 20;16(1):1811. doi: 10.1038/s41467-025-57049-5.

Signatures of Perseveration and Heuristic-Based Directed Exploration in Two-Step Sequential Decision Task Behaviour.

Comput Psychiatr. 2025 Feb 11;9(1):39-62. doi: 10.5334/cpsy.101. eCollection 2025.

Contributions of Attention to Learning in Multidimensional Reward Environments.

J Neurosci. 2025 Feb 12;45(7):e2300232024. doi: 10.1523/JNEUROSCI.2300-23.2024.

Computational and Neural Evidence for Altered Fast and Slow Learning from Losses in Problem Gambling.

J Neurosci. 2025 Jan 1;45(1):e0080242024. doi: 10.1523/JNEUROSCI.0080-24.2024.

Pupillary responses to directional uncertainty while intercepting a moving target.

R Soc Open Sci. 2024 Oct 2;11(10):240606. doi: 10.1098/rsos.240606. eCollection 2024 Oct.

Sensory choices as logistic classification.

Neuron. 2024 Sep 4;112(17):2854-2868.e1. doi: 10.1016/j.neuron.2024.06.016. Epub 2024 Jul 15.

Sensory choices as logistic classification.

bioRxiv. 2024 Jun 27:2024.01.17.576029. doi: 10.1101/2024.01.17.576029.

References

Learning variable and stereotypical sequences of responses: Some data and a new model.

Behav Processes. 1993 Oct;30(2):103-29. doi: 10.1016/0376-6357(93)90002-9.

Choice, contingency discrimination, and foraging theory.

J Exp Anal Behav. 1999 May;71(3):355-73. doi: 10.1901/jeab.1999.71-355.

Investigating Behavioral Dynamics With A Fixed-time Extinction Schedule And Linear Analysis.

J Exp Anal Behav. 1996 Nov;66(3):391-409. doi: 10.1901/jeab.1996.66-391.

Applying linear systems analysis to dynamic behavior.

J Exp Anal Behav. 1992 May;57(3):377-91. doi: 10.1901/jeab.1992.57-377.

Determination of a behavioral transfer function: White-noise analysis of session-to-session response-ratio dynamics on concurrent VI VI schedules.

J Exp Anal Behav. 1985 Jan;43(1):43-59. doi: 10.1901/jeab.1985.43-43.

Hill-climbing by pigeons.

J Exp Anal Behav. 1983 Jan;39(1):25-47. doi: 10.1901/jeab.1983.39-25.

Optimal choice.

J Exp Anal Behav. 1981 May;35(3):397-412. doi: 10.1901/jeab.1981.35-397.

How to maximize reward rate on two variable-interval paradigms.

J Exp Anal Behav. 1981 May;35(3):367-96. doi: 10.1901/jeab.1981.35-367.

A Markov model description of changeover probabilities on concurrent variable-interval schedules.

J Exp Anal Behav. 1979 Jan;31(1):41-51. doi: 10.1901/jeab.1979.31-41.

Duration and rate of reinforcement as determinants of concurrent responding.

J Exp Anal Behav. 1977 Sep;28(2):145-53. doi: 10.1901/jeab.1977.28-145.
