Navalpakkam Vidhya, Koch Christof, Perona Pietro
Division of Biology, California Institute of Technology, Pasadena, CA 91125, USA.
J Vis. 2009 Jan 23;9(1):31.1-16. doi: 10.1167/9.1.31.
How do reward outcomes affect early visual performance? Previous studies found a suboptimal influence, but they ignored the non-linearity in how subjects perceive reward outcomes. In contrast, we find that when this non-linearity is accounted for, humans behave optimally and maximize expected reward. Our subjects were asked to detect the presence of a familiar target object in a cluttered scene and were rewarded according to their performance. We systematically varied the target frequency and the reward/penalty policy for detecting/missing the targets. We find that (1) decreasing the target frequency decreases detection rates, in accordance with the literature; (2) contrary to previous studies, increasing the reward for target detection compensates for target rarity and restores detection performance; (3) a quantitative model based on reward maximization accurately predicts human detection behavior in all target frequency and reward conditions, so reward schemes can be designed to obtain desired detection rates for rare targets; and (4) subjects quickly learn the optimal decision strategy, and we propose a neurally plausible model that exhibits the same properties. Potential applications include designing reward schemes to improve detection of life-critical, rare targets (e.g., cancers in medical images).
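The abstract does not give the equations of its reward-maximization model, so the following is only a minimal sketch of how target frequency and payoffs can trade off, assuming a textbook equal-variance Gaussian signal detection observer. The function names, the sensitivity `d_prime`, and the payoff values are hypothetical, and the payoffs are treated as already-subjective utilities; the non-linearity in perceived reward that the abstract mentions is not modeled here.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def optimal_detection(p_target, d_prime, v_hit, v_miss, v_fa, v_cr):
    """Reward-maximizing criterion for an equal-variance Gaussian observer.

    Assumption (not from the abstract): the internal response is N(0, 1) on
    target-absent trials and N(d_prime, 1) on target-present trials.
    Returns (criterion, hit_rate, false_alarm_rate).
    """
    if not 0.0 < p_target < 1.0:
        raise ValueError("p_target must lie strictly between 0 and 1")
    if d_prime <= 0.0:
        raise ValueError("d_prime must be positive")
    if v_hit <= v_miss or v_cr <= v_fa:
        raise ValueError("correct responses must pay more than errors")

    # Likelihood-ratio threshold that maximizes expected reward:
    # prior odds against the target times the ratio of payoff differences.
    beta = ((1.0 - p_target) / p_target) * ((v_cr - v_fa) / (v_hit - v_miss))

    # Location on the decision axis where the likelihood ratio equals beta.
    criterion = math.log(beta) / d_prime + d_prime / 2.0

    hit_rate = 1.0 - norm_cdf(criterion - d_prime)
    fa_rate = 1.0 - norm_cdf(criterion)
    return criterion, hit_rate, fa_rate


# Hypothetical conditions: frequent target, rare target, rare target with a
# larger hit reward chosen so that the payoff ratio cancels the prior odds.
frequent = optimal_detection(0.5, 2.0, v_hit=1, v_miss=-1, v_fa=-1, v_cr=1)
rare = optimal_detection(0.1, 2.0, v_hit=1, v_miss=-1, v_fa=-1, v_cr=1)
compensated = optimal_detection(0.1, 2.0, v_hit=17, v_miss=-1, v_fa=-1, v_cr=1)

for label, (c, h, f) in [("frequent", frequent), ("rare", rare),
                         ("rare, higher hit reward", compensated)]:
    print(f"{label}: criterion={c:.3f} hit={h:.3f} false_alarm={f:.3f}")
```

Under these assumptions the rare-target condition yields a stricter criterion and a lower hit rate than the frequent-target condition, and raising the hit reward moves the criterion back, which mirrors findings (1) and (2) qualitatively. It is not a reproduction of the authors' fitted model.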