Wellcome Trust Centre for Neuroimaging, University College London, London, United Kingdom.
PLoS One. 2010 Dec 14;5(12):e15555. doi: 10.1371/journal.pone.0015555.
In a companion paper [1], we have presented a generic approach for inferring how subjects make optimal decisions under uncertainty. From a Bayesian decision theoretic perspective, uncertain representations correspond to "posterior" beliefs, which result from integrating (sensory) information with subjective "prior" beliefs. Preferences and goals are encoded through a "loss" (or "utility") function, which measures the cost incurred by making any admissible decision for any given (hidden or unknown) state of the world. By assuming that subjects make optimal decisions on the basis of updated (posterior) beliefs and utility (loss) functions, one can evaluate the likelihood of observed behaviour. In this paper, we describe a concrete implementation of this meta-Bayesian approach (i.e. a Bayesian treatment of Bayesian decision theoretic predictions) and demonstrate its utility by applying it to both simulated and empirical reaction time data from an associative learning task. Here, inter-trial variability in reaction times is modelled as reflecting the dynamics of the subjects' internal recognition process, i.e. the updating of representations (posterior densities) of hidden states over trials while subjects learn probabilistic audio-visual associations. We use this paradigm to demonstrate that our meta-Bayesian framework allows for (i) probabilistic inference on the dynamics of the subject's representation of environmental states, and for (ii) model selection to disambiguate between alternative preferences (loss functions) human subjects could employ when dealing with trade-offs, such as between speed and accuracy. Finally, we illustrate how our approach can be used to quantify subjective beliefs and preferences that underlie inter-individual differences in behaviour.
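As a rough illustration of the Bayesian decision theoretic ingredients named above (prior, posterior, loss function, optimal decision), the following toy sketch may help. It is not the model used in the paper: it assumes a simple Beta-Bernoulli learner and a 0-1 loss, and every name in it (`update_beta`, `expected_loss`, `optimal_decision`, the outcome sequence) is hypothetical and chosen only for this example.

```python
def update_beta(a, b, outcome):
    """Conjugate Beta-Bernoulli update; outcome is 1 or 0 (assumed toy learner)."""
    return a + outcome, b + (1 - outcome)


def expected_loss(p, decision, loss):
    """Expected loss of a decision when state 1 has posterior probability p."""
    return p * loss[(1, decision)] + (1 - p) * loss[(0, decision)]


def optimal_decision(p, loss, decisions=(0, 1)):
    """Pick the decision that minimises expected loss under the current belief."""
    return min(decisions, key=lambda d: expected_loss(p, d, loss))


# 0-1 loss: zero cost when the decision matches the state, unit cost otherwise.
zero_one = {(s, d): 0.0 if s == d else 1.0 for s in (0, 1) for d in (0, 1)}

a, b = 1.0, 1.0  # flat Beta(1, 1) prior over the association strength
outcomes = [1, 1, 0, 1, 1, 1, 0, 1]  # made-up trial outcomes, not real data
for o in outcomes:
    a, b = update_beta(a, b, o)  # trial-by-trial belief update

posterior_mean = a / (a + b)
choice = optimal_decision(posterior_mean, zero_one)
print(f"posterior mean = {posterior_mean:.2f}, optimal decision = {choice}")
```

Swapping `zero_one` for a different loss table changes the optimal decision without changing the belief update, which is the sense in which alternative loss functions can be compared against the same learning dynamics.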