Eppinger Ben, Walter Maik, Li Shu-Chen
Chair of Lifespan Developmental Neuroscience, Department of Psychology, TU Dresden, Dresden, Germany.
Department of Psychology, Concordia University, Loyola Campus, 7141 Sherbrooke Street W., Montreal, Quebec, Canada.
Cogn Affect Behav Neurosci. 2017 Apr;17(2):406-421. doi: 10.3758/s13415-016-0487-3.
In this study, we investigated the interplay of habitual (model-free) and goal-directed (model-based) decision processes by using a two-stage Markov decision task in combination with event-related potentials (ERPs) and computational modeling. To manipulate the demands on model-based decision making, we applied two experimental conditions with different probabilities of transitioning from the first to the second stage of the task. As we expected, when the stage transitions were more predictable, participants showed greater model-based (planning) behavior. Consistent with this result, we found that stimulus-evoked parietal (P300) activity at the second stage of the task increased with the predictability of the state transitions. However, the parietal activity also reflected model-free information about the expected values of the stimuli, indicating that at this stage of the task both types of information are integrated to guide decision making. Outcome-related ERP components only reflected reward-related processes: Specifically, a medial prefrontal ERP component (the feedback-related negativity) was sensitive to negative outcomes, whereas a component that is elicited by reward (the feedback-related positivity) increased as a function of positive prediction errors. Taken together, our data indicate that stimulus-locked parietal activity reflects the integration of model-based and model-free information during decision making, whereas feedback-related medial prefrontal signals primarily reflect reward-related decision processes.
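The abstract does not specify the computational model, so the following is only a minimal sketch of how model-based and model-free values are commonly combined in two-stage tasks, assuming a simple weighted mixture. The function names, the weight `w`, and all transition probabilities and values are hypothetical and are not taken from the study. It illustrates why more predictable stage transitions give planning more leverage: the model-based values of the two first-stage options separate more clearly.

```python
def model_based_values(transition, q_stage2):
    """Expected value of each first-stage action under a known transition model.

    transition[a][s] = P(second-stage state s | first-stage action a)
    q_stage2[s][a2]  = learned value of action a2 in second-stage state s
    """
    best_stage2 = [max(values) for values in q_stage2]  # best action value per state
    return [sum(p * v for p, v in zip(probs, best_stage2)) for probs in transition]


def hybrid_values(q_mb, q_mf, w):
    """Weighted mix of model-based and model-free values (w = 1 is purely model-based)."""
    return [w * mb + (1.0 - w) * mf for mb, mf in zip(q_mb, q_mf)]


def prediction_error(reward, expected):
    """Reward prediction error: positive when the outcome is better than expected."""
    return reward - expected


# Hypothetical numbers for illustration only.
q_stage2 = [[0.8, 0.2],   # second-stage state 0
            [0.3, 0.1]]   # second-stage state 1
predictable = [[0.8, 0.2], [0.2, 0.8]]       # assumed high-predictability condition
less_predictable = [[0.6, 0.4], [0.4, 0.6]]  # assumed low-predictability condition

q_mb_high = model_based_values(predictable, q_stage2)
q_mb_low = model_based_values(less_predictable, q_stage2)

# The gap between first-stage options is larger when transitions are more predictable.
gap_high = q_mb_high[0] - q_mb_high[1]
gap_low = q_mb_low[0] - q_mb_low[1]
```

In this sketch, `hybrid_values` stands in for the integration of both kinds of information at the decision stage, and `prediction_error` for the quantity that the reward-related feedback component is reported to track when it is positive.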