Story Giles W, Vlaev Ivo, Seymour Ben, Darzi Ara, Dolan Raymond J
Department of Surgery and Cancer, Centre for Health Policy, Institute of Global Health Innovation, Imperial College London, London, UK; Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, UK.
Department of Surgery and Cancer, Centre for Health Policy, Institute of Global Health Innovation, Imperial College London, London, UK.
Front Behav Neurosci. 2014 Mar 12;8:76. doi: 10.3389/fnbeh.2014.00076. eCollection 2014.
The tendency to make unhealthy choices is hypothesized to be related to an individual's temporal discount rate, the theoretical rate at which they devalue delayed rewards. Furthermore, a particular form of temporal discounting, hyperbolic discounting, has been proposed to explain why unhealthy behavior can occur despite healthy intentions. We examine these two hypotheses in turn. First, we systematically review studies that investigate whether discount rates can predict unhealthy behavior. These studies reveal that high discount rates for money (and in some instances food or drug rewards) are associated with several unhealthy behaviors and markers of health status, establishing discounting as a promising predictive measure. Second, we examine whether intention-incongruent unhealthy actions are consistent with hyperbolic discounting. We conclude that intention-incongruent actions are often triggered by environmental cues or changes in motivational state, whose effects are not parameterized by hyperbolic discounting. We propose a framework for understanding these state-based effects in terms of the interplay of two distinct reinforcement learning mechanisms: a "model-based" (or goal-directed) system and a "model-free" (or habitual) system. Under this framework, discounting of delayed health may contribute to the initiation of unhealthy behavior, but with repetition many unhealthy behaviors become habitual; if health goals then change, habitual behavior can still arise in response to environmental cues. We propose that the burgeoning development of computational models of these processes will permit further identification of health decision-making phenotypes.
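The abstract does not state a functional form for discounting, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used one-parameter hyperbolic form V = A / (1 + kD) and, for contrast, the exponential form V = A * exp(-kD). It illustrates why hyperbolic discounting has been proposed as an account of behavior that contradicts earlier intentions: the preference between a smaller-sooner and a larger-later reward can reverse as both draw near. The function names, reward amounts, delays, and the value of k are hypothetical and chosen only for illustration.

```python
import math


def hyperbolic_value(amount, delay, k):
    """Assumed one-parameter hyperbolic form: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, k):
    """Exponential form for comparison: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay)


def prefers_larger_later(discount, k, lead_time):
    """Compare a smaller-sooner and a larger-later reward.

    The smaller reward (50 units) arrives after `lead_time` days and the
    larger reward (100 units) arrives 30 days after that. All amounts and
    delays are illustrative assumptions.
    """
    smaller_sooner = discount(50.0, lead_time, k)
    larger_later = discount(100.0, lead_time + 30.0, k)
    return larger_later > smaller_sooner


K = 0.1  # hypothetical discount rate per day

for lead_time in (60.0, 0.0):
    print(
        f"lead time {lead_time:>4.0f} days:",
        "hyperbolic prefers larger-later =",
        prefers_larger_later(hyperbolic_value, K, lead_time),
        "| exponential prefers larger-later =",
        prefers_larger_later(exponential_value, K, lead_time),
    )
```

With these illustrative numbers, the hyperbolic chooser favors the larger-later reward when both options are distant and switches to the smaller-sooner reward once it is immediately available. The exponential chooser gives the same answer at both lead times, because the ratio of the two discounted values does not depend on the lead time. The sketch covers only this delay-driven reversal; it does not capture the cue-triggered or motivational-state effects that the abstract argues are not parameterized by hyperbolic discounting.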