Dynamic integration of reward and stimulus information in perceptual decision-making.

Affiliation

Department of Psychology, Stanford University, Stanford, California, United States of America.

Publication information

PLoS One. 2011 Mar 3;6(3):e16749. doi: 10.1371/journal.pone.0016749.

Abstract

In perceptual decision-making, ideal decision-makers should bias their choices toward alternatives associated with larger rewards, and the extent of the bias should decrease as stimulus sensitivity increases. When responses must be made at different times after stimulus onset, stimulus sensitivity grows with time from zero to a final asymptotic level. Are decision makers able to produce responses that are more biased if they are made soon after stimulus onset, but less biased if they are made after more evidence has been accumulated? If so, how close to optimal can they come in doing this, and how might their performance be achieved mechanistically? We report an experiment in which the payoff for each alternative is indicated before stimulus onset. Processing time is controlled by a "go" cue occurring at different times post stimulus onset, requiring a response within msec. Reward bias does start high when processing time is short and decreases as sensitivity increases, leveling off at a non-zero value. However, the degree of bias is sub-optimal for shorter processing times. We present a mechanistic account of participants' performance within the framework of the leaky competing accumulator model [1], in which accumulators for each alternative accumulate noisy information subject to leakage and mutual inhibition. The leveling off of accuracy is attributed to mutual inhibition between the accumulators, allowing the accumulator that gathers the most evidence early in a trial to suppress the alternative. Three ways reward might affect decision making in this framework are considered. One of the three, in which reward affects the starting point of the evidence accumulation process, is consistent with the qualitative pattern of the observed reward bias effect, while the other two are not. Incorporating this assumption into the leaky competing accumulator model, we are able to provide close quantitative fits to individual participant data.
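
The mechanism described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' fitted model: the function name `simulate_lca`, every parameter value, and the use of a linear (unrectified) two-accumulator system are assumptions made for illustration only. Reward is modeled as a head start for the accumulator of the high-reward alternative, and the choice is read out as whichever accumulator is ahead when the "go" cue arrives.

```python
import numpy as np


def simulate_lca(n_trials, go_time, drift=(1.1, 0.9), start_offset=0.0,
                 leak=1.0, inhibition=2.0, noise=0.5, dt=0.01, seed=0):
    """Fraction of trials in which accumulator 0 leads at the go cue.

    Illustrative assumptions (not taken from the paper): all parameter
    values, Euler integration, and no rectification of activations.
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(go_time / dt))
    inputs = np.asarray(drift, dtype=float)

    # Both accumulators start at zero; reward gives accumulator 0 a head start.
    x = np.zeros((n_trials, 2))
    x[:, 0] += start_offset

    for _ in range(n_steps):
        other = x[:, ::-1]  # activation of the competing accumulator
        # Leaky, mutually inhibiting accumulation of noisy input.
        dx = (inputs - leak * x - inhibition * other) * dt
        dx += noise * np.sqrt(dt) * rng.standard_normal(x.shape)
        x = x + dx

    return float(np.mean(x[:, 0] > x[:, 1]))


if __name__ == "__main__":
    # The stimulus favors alternative 1, the reward favors alternative 0.
    for go_time in (0.0, 0.1, 0.3, 0.6, 1.0):
        p = simulate_lca(2000, go_time, drift=(0.9, 1.1), start_offset=0.2)
        print(f"go cue at {go_time:.1f} s: P(choose high-reward) = {p:.3f}")
```

With inhibition set larger than leak, as assumed here, the difference between the two accumulators amplifies itself over time, so early evidence and the initial head start both keep an influence on late choices. That is one way to picture why accuracy levels off and why the reward bias settles at a non-zero value rather than vanishing.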
