Stimulus discriminability may bias value-based probabilistic learning.

Author information

Schutte Iris, Slagter Heleen A, Collins Anne G E, Frank Michael J, Kenemans J Leon

Affiliations

Department of Experimental Psychology and Psychopharmacology, Helmholtz Institute, Utrecht University, Utrecht, The Netherlands.

Department of Psychology and ABC, University of Amsterdam, Amsterdam, The Netherlands.

Publication information

PLoS One. 2017 May 8;12(5):e0176205. doi: 10.1371/journal.pone.0176205. eCollection 2017.

DOI:10.1371/journal.pone.0176205

PMID:28481915

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5421749/

Abstract

Reinforcement learning tasks are often used to assess participants' tendency to learn more from the positive or more from the negative consequences of their actions. However, this assessment often requires comparing learning performance across different task conditions, which may differ in the relative salience or discriminability of the stimuli associated with more and less rewarding outcomes, respectively. To address this issue, in a first set of studies, participants completed two versions of a common probabilistic learning task. The two versions differed with respect to the stimulus (Hiragana) characters associated with reward probability. The assignment of character to reward probability was fixed within version but reversed between versions. We found that performance was highly influenced by task version, which could be explained by the relative perceptual discriminability of characters assigned to high or low reward probabilities, as assessed by a separate discrimination experiment. Participants were more reliable in selecting rewarding characters that were more discriminable, leading to differences in learning curves and their sensitivity to reward probability. This difference in experienced reinforcement history was accompanied by performance biases in a test phase assessing the ability to learn from positive vs. negative outcomes. In a subsequent large-scale web-based experiment, this impact of task version on learning and test measures was replicated and extended. Collectively, these findings imply a key role for perceptual factors in guiding reward learning and underscore the need to control stimulus discriminability when making inferences about individual differences in reinforcement learning.
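
To make the abstract's central claim concrete, the following is a minimal illustrative sketch of how lower stimulus discriminability could reduce the reliability of selecting the more rewarding character. It is not the authors' model or analysis. Everything in it is an assumption made for illustration: the delta-rule learner with softmax choice, the 80%/20% reward probabilities, the parameter values, and in particular the hypothetical `confusion` parameter, which stands for the probability that the two characters are mixed up on a given trial.

```python
import math
import random


def simulate_pair(p_good=0.8, p_bad=0.2, confusion=0.0,
                  n_trials=60, alpha=0.1, beta=5.0, rng=None):
    """Return the fraction of trials on which the better stimulus was actually selected.

    All parameters are illustrative assumptions, not values from the paper.
    """
    if rng is None:
        rng = random.Random()
    q = [0.0, 0.0]            # learned values; index 0 = better stimulus, 1 = worse
    probs = [p_good, p_bad]   # reward probability of each stimulus
    n_correct = 0
    for _ in range(n_trials):
        # Softmax over two options reduces to a logistic on the value difference.
        p_choose_good = 1.0 / (1.0 + math.exp(-beta * (q[0] - q[1])))
        intended = 0 if rng.random() < p_choose_good else 1
        # Hypothetical perceptual error: the two characters get mixed up.
        actual = 1 - intended if rng.random() < confusion else intended
        reward = 1.0 if rng.random() < probs[actual] else 0.0
        # Credit is assigned to the stimulus the agent believes it chose.
        q[intended] += alpha * (reward - q[intended])
        n_correct += (actual == 0)
    return n_correct / n_trials


def mean_accuracy(confusion, n_agents=500, seed=0):
    """Average accuracy over simulated agents, seeded for reproducibility."""
    rng = random.Random(seed)
    total = sum(simulate_pair(confusion=confusion, rng=rng)
                for _ in range(n_agents))
    return total / n_agents


if __name__ == "__main__":
    print("easily discriminable pair:", mean_accuracy(0.0))
    print("poorly discriminable pair:", mean_accuracy(0.3))
```

Under these assumptions, the poorly discriminable pair yields lower accuracy and a noisier reinforcement history than the easily discriminable pair, which is the kind of version-dependent difference the abstract describes. The sketch says nothing about the size of the effect in the actual experiments.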
