Reward rate optimization in two-alternative decision making: empirical tests of theoretical predictions.

Affiliation

Princeton Neuroscience Institute, Princeton University, USA.

Publication information

J Exp Psychol Hum Percept Perform. 2009 Dec;35(6):1865-97. doi: 10.1037/a0016926.

DOI:10.1037/a0016926

PMID:19968441

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2791916/

Abstract

The drift-diffusion model (DDM) implements an optimal decision procedure for stationary, 2-alternative forced-choice tasks. The height of a decision threshold applied to accumulating information on each trial determines a speed-accuracy tradeoff (SAT) for the DDM, thereby accounting for a ubiquitous feature of human performance in speeded response tasks. However, little is known about how participants settle on particular tradeoffs. One possibility is that they select SATs that maximize a subjective rate of reward earned for performance. For the DDM, there exist unique, reward-rate-maximizing values for its threshold and starting point parameters in free-response tasks that reward correct responses (R. Bogacz, E. Brown, J. Moehlis, P. Holmes, & J. D. Cohen, 2006). These optimal values vary as a function of response-stimulus interval, prior stimulus probability, and relative reward magnitude for correct responses. We tested the resulting quantitative predictions regarding response time, accuracy, and response bias under these task manipulations and found that grouped data conformed well to the predictions of an optimally parameterized DDM.
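The dependence of the reward-rate-maximizing threshold on response-stimulus interval can be illustrated with a minimal sketch. It assumes the closed-form error rate and mean decision time of the pure DDM with an unbiased starting point and equally likely stimuli, as given in Bogacz et al. (2006), and defines reward rate as the proportion of correct responses divided by the mean time per trial. The function names, the grid search, and all parameter values (drift, noise, non-decision time) are illustrative assumptions, not the fitting procedure or values used in the study.

```python
import math


def error_rate(drift, threshold, noise=1.0):
    """Error rate of an unbiased pure DDM with thresholds at +/- threshold."""
    return 1.0 / (1.0 + math.exp(2.0 * drift * threshold / noise ** 2))


def mean_decision_time(drift, threshold, noise=1.0):
    """Mean decision time of the same unbiased pure DDM."""
    return (threshold / drift) * math.tanh(drift * threshold / noise ** 2)


def reward_rate(drift, threshold, rsi, non_decision=0.3, noise=1.0):
    """Correct responses per unit time: (1 - ER) / (DT + T0 + RSI).

    non_decision (T0) and rsi are in the same time units as the decision time.
    No extra delay after errors is modeled here (an assumption of this sketch).
    """
    er = error_rate(drift, threshold, noise)
    dt = mean_decision_time(drift, threshold, noise)
    return (1.0 - er) / (dt + non_decision + rsi)


def optimal_threshold(drift, rsi, non_decision=0.3, noise=1.0,
                      z_max=5.0, steps=5000):
    """Grid search for the threshold that maximizes reward rate.

    A hypothetical helper: a simple grid over (0, z_max] stands in for a
    proper numerical optimizer.
    """
    grid = [z_max * i / steps for i in range(1, steps + 1)]
    return max(grid, key=lambda z: reward_rate(drift, z, rsi, non_decision, noise))


if __name__ == "__main__":
    # Illustrative values only: longer RSIs should favor higher thresholds.
    for rsi in (0.5, 1.0, 2.0):
        z = optimal_threshold(drift=1.0, rsi=rsi)
        print(f"RSI={rsi:.1f}  threshold={z:.3f}  "
              f"ER={error_rate(1.0, z):.3f}  "
              f"RR={reward_rate(1.0, z, rsi):.3f}")
```

Under these assumptions the optimal threshold rises with the response-stimulus interval: when each trial costs more time, slower and more accurate responding earns more reward per unit time. The sketch covers only the threshold; the starting-point shifts predicted for unequal priors or rewards are not modeled.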

Similar articles

Do humans produce the speed-accuracy trade-off that maximizes reward rate?

Q J Exp Psychol (Hove). 2010 May;63(5):863-91. doi: 10.1080/17470210903091643. Epub 2009 Sep 10.

Optimal decision making in neural inhibition models.

Psychol Rev. 2012 Jan;119(1):201-15. doi: 10.1037/a0026275. Epub 2011 Nov 21.

Rapid decision threshold modulation by reward rate in a neural network.

Neural Netw. 2006 Oct;19(8):1013-26. doi: 10.1016/j.neunet.2006.05.038. Epub 2006 Sep 20.

Acquisition of decision making criteria: reward rate ultimately beats accuracy.

Atten Percept Psychophys. 2011 Feb;73(2):640-57. doi: 10.3758/s13414-010-0049-7.

Explicit melioration by a neural diffusion model.

Brain Res. 2009 Nov 24;1299:95-117. doi: 10.1016/j.brainres.2009.07.017. Epub 2009 Jul 30.

Drift diffusion model of reward and punishment learning in schizophrenia: Modeling and experimental data.

Behav Brain Res. 2015 Sep 15;291:147-154. doi: 10.1016/j.bbr.2015.05.024. Epub 2015 May 22.

Drift-Diffusion Model Reveals Impaired Reward-Based Perceptual Decision-Making Processes Associated with Depression in Late Childhood and Early Adolescent Girls.

Res Child Adolesc Psychopathol. 2022 Nov;50(11):1515-1528. doi: 10.1007/s10802-022-00936-y. Epub 2022 Jun 9.

The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks.

Psychol Rev. 2006 Oct;113(4):700-65. doi: 10.1037/0033-295X.113.4.700.

Single-trial dynamics explain magnitude sensitive decision making.

BMC Neurosci. 2018 Sep 10;19(1):54. doi: 10.1186/s12868-018-0457-5.

Cited by

Time and memory costs jointly determine a speed-accuracy trade-off and set-size effects.

J Exp Psychol Gen. 2025 Jun;154(6):1611-1627. doi: 10.1037/xge0001760. Epub 2025 Apr 7.

People are at least as good at optimizing reward rate under equivalent fixed-trial compared to fixed-time conditions.

Psychon Bull Rev. 2025 Apr 3. doi: 10.3758/s13423-025-02680-y.

Mutual inclusivity improves decision-making by smoothing out choice's competitive edge.

Nat Hum Behav. 2025 Mar;9(3):521-533. doi: 10.1038/s41562-024-02064-7. Epub 2024 Dec 20.

Trial-history biases in evidence accumulation can give rise to apparent lapses in decision-making.

Nat Commun. 2024 Jan 22;15(1):662. doi: 10.1038/s41467-024-44880-5.

Neural Representations of Post-Decision Accuracy and Reward Expectation in the Caudate Nucleus and Frontal Eye Field.

J Neurosci. 2024 Jan 10;44(2):e0902232023. doi: 10.1523/JNEUROSCI.0902-23.2023.

Modelling decision-making biases.

Front Comput Neurosci. 2023 Oct 20;17:1222924. doi: 10.3389/fncom.2023.1222924. eCollection 2023.

Predictions and rewards affect decision-making but not subjective experience.

Proc Natl Acad Sci U S A. 2023 Oct 31;120(44):e2220749120. doi: 10.1073/pnas.2220749120. Epub 2023 Oct 25.

Contributions of the Basal Ganglia to Visual Perceptual Decisions.

Annu Rev Vis Sci. 2023 Sep 15;9:385-407. doi: 10.1146/annurev-vision-111022-123804.

Humans reconfigure target and distractor processing to address distinct task demands.

Psychol Rev. 2024 Mar;131(2):349-372. doi: 10.1037/rev0000442. Epub 2023 Sep 4.

Visuo-vestibular heading perception: a model system to study multi-sensory decision making.

Philos Trans R Soc Lond B Biol Sci. 2023 Sep 25;378(1886):20220334. doi: 10.1098/rstb.2022.0334. Epub 2023 Aug 7.

References

Robust versus optimal strategies for two-alternative forced choice tasks.

J Math Psychol. 2010 Apr 1;54(2):230-246. doi: 10.1016/j.jmp.2009.12.004. Epub 2010 Jan 13.

Do humans produce the speed-accuracy trade-off that maximizes reward rate?

Q J Exp Psychol (Hove). 2010 May;63(5):863-91. doi: 10.1080/17470210903091643. Epub 2009 Sep 10.

Why do we slow down after an error? Mechanisms underlying the effects of posterror slowing.

Q J Exp Psychol (Hove). 2009 Feb;62(2):209-18. doi: 10.1080/17470210802240655. Epub 2008 Aug 8.

Fitting the Ratcliff diffusion model to experimental data.

Psychon Bull Rev. 2007 Dec;14(6):1011-26. doi: 10.3758/bf03193087.

The diffusion decision model: theory and data for two-choice decision tasks.

Neural Comput. 2008 Apr;20(4):873-922. doi: 10.1162/neco.2008.12-06-420.

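The physics of optimal decision making: a formal analysis of models of performance in two-alternative forced-choice tasks.

Psychol Rev. 2006 Oct;113(4):700-65. doi: 10.1037/0033-295X.113.4.700.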

Rapid decision threshold modulation by reward rate in a neural network.

Neural Netw. 2006 Oct;19(8):1013-26. doi: 10.1016/j.neunet.2006.05.038. Epub 2006 Sep 20.

Risks of drawing inferences about cognitive processes from model fits to individual versus average performance.

Psychon Bull Rev. 2005 Jun;12(3):403-8. doi: 10.3758/bf03193784.

The effect of stimulus strength on the speed and accuracy of a perceptual decision.

J Vis. 2005 May 2;5(5):376-404. doi: 10.1167/5.5.1.

Interpreting the parameters of the diffusion model: an empirical validation.

Mem Cognit. 2004 Oct;32(7):1206-20. doi: 10.3758/bf03196893.
