Department of Complexity Science and Engineering, Graduate School of Frontier Sciences, The University of Tokyo Kashiwa, Japan.
Center for Evolutionary Cognitive Sciences, The University of Tokyo Tokyo, Japan ; RIKEN Brain Science Institute Wako, Japan ; Okanoya Emotional Information Project, Exploratory Research for Advanced Technology (ERATO), Japan Science and Technology Agency Wako, Japan.
Front Comput Neurosci. 2014 Mar 4;8:18. doi: 10.3389/fncom.2014.00018. eCollection 2014.
The decision making behaviors of humans and animals adapt and then satisfy an "operant matching law" in certain type of tasks. This was first pointed out by Herrnstein in his foraging experiments on pigeons. The matching law has been one landmark for elucidating the underlying processes of decision making and its learning in the brain. An interesting question is whether decisions are made deterministically or probabilistically. Conventional learning models of the matching law are based on the latter idea; they assume that subjects learn choice probabilities of respective alternatives and decide stochastically with the probabilities. However, it is unknown whether the matching law can be accounted for by a deterministic strategy or not. To answer this question, we propose several deterministic Bayesian decision making models that have certain incorrect beliefs about an environment. We claim that a simple model produces behavior satisfying the matching law in static settings of a foraging task but not in dynamic settings. We found that the model that has a belief that the environment is volatile works well in the dynamic foraging task and exhibits undermatching, which is a slight deviation from the matching law observed in many experiments. This model also demonstrates the double-exponential reward history dependency of a choice and a heavier-tailed run-length distribution, as has recently been reported in experiments on monkeys.
人类和动物的决策行为会适应环境,并在某些类型的任务中满足“操作性匹配定律”。这是赫恩斯坦(Herrnstein)在对鸽子的觅食实验中首先指出的。匹配定律一直是阐明大脑中决策及其学习的基础过程的一个里程碑。一个有趣的问题是,决策是确定性的还是概率性的。传统的匹配定律学习模型基于后一种观点;它们假设主体学习各自替代方案的选择概率,并根据概率进行随机决策。然而,目前尚不清楚匹配定律是否可以通过确定性策略来解释。为了回答这个问题,我们提出了几个确定性贝叶斯决策模型,这些模型对环境有某些错误的信念。我们声称,一个简单的模型在觅食任务的静态环境中产生符合匹配定律的行为,但在动态环境中则不然。我们发现,具有环境不稳定信念的模型在动态觅食任务中表现良好,并表现出低估,这与许多实验中观察到的轻微偏离匹配定律的情况一致。该模型还展示了选择的双指数奖励历史依赖性和运行长度分布的更重尾,这与最近在猴子实验中报告的情况一致。