Wellcome Trust Centre for Neuroimaging at UCL, 12 Queen Square, London WC1N 3BG, UK.
Cognition. 2011 Jun;119(3):394-402. doi: 10.1016/j.cognition.2011.02.004. Epub 2011 Feb 26.
Action-outcome contingencies can be learnt either by active trial-and-error, or vicariously, by observing the outcomes of actions performed by others. The extant literature is ambiguous as to which of these modes of learning is more effective, as controlled comparisons of operant and observational learning are rare. Here, we contrasted human operant and observational value learning, assessing implicit and explicit measures of learning from positive and negative reinforcement. We show that, compared to direct operant learning, observational learning is associated with an optimistic over-valuation of low-value options, a pattern apparent both in participants' choice preferences and in their explicit post-hoc estimates of value. Learning of higher-value options showed no such bias. We suggest that this bias can be explained as a tendency for optimistic underestimation of the chance of experiencing negative events, an optimism that is repressed when information is gathered through direct operant learning.