Brain Function and Psychological Science Research Center, Shenzhen University, No 3688, Nanhai Road, Nanshan District, Shenzhen, 518060, China.
Shenzhen Key Laboratory of Affective and Social Cognitive Science, Shenzhen University, Shenzhen, China.
Cogn Affect Behav Neurosci. 2018 Oct;18(5):949-963. doi: 10.3758/s13415-018-0615-3.
Although a growing number of studies have investigated the neural mechanisms of reinforcement learning, it remains unclear how the brain responds to unreliable feedback. A recent theory proposes that the reward positivity (RewP) component of the event-related brain potential (ERP) and frontal midline theta (FMT) power reflect separate feedback-related processing functions of anterior cingulate cortex (ACC). In the present study, the electroencephalogram (EEG) was recorded from participants as they performed a time estimation task in which feedback reliability was manipulated across conditions. After each response, participants received a cue indicating that the upcoming feedback stimulus was 100%, 75%, or 50% reliable. The results showed that participants' time estimates adjusted linearly according to feedback reliability. Moreover, the cue indicating 100% reliability elicited a larger RewP-like ERP component than the other cues did, whereas feedback presentation elicited a RewP of approximately equal amplitude in all three reliability conditions. By contrast, FMT power elicited by negative feedback decreased linearly from the 100% condition to the 75% and 50% conditions, and only FMT power predicted behavioral adjustments on the following trials. In addition, an analysis of Beta power and cross-frequency coupling (CFC) of Beta power with FMT phase suggested that Beta-FMT communication modulated motor areas for the purpose of adjusting behavior. We interpret these findings in terms of the hierarchical reinforcement learning account of ACC, in which the RewP and FMT are proposed to reflect the reward processing and control functions of ACC, respectively.
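The coupling of Beta power with FMT phase mentioned in the abstract can be illustrated with a generic phase-amplitude coupling index. The sketch below is not the authors' pipeline: the abstract does not say which CFC measure, frequency bands, or electrodes were used. It assumes a normalized mean vector length as the index, a 4-8 Hz theta band, a 13-30 Hz Beta band, and synthetic signals in place of EEG; the function names `bandpass` and `pac_mvl` are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def bandpass(x, low, high, fs, order=4):
    """Zero-phase Butterworth band-pass filter (assumed filter choice)."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def pac_mvl(phase_sig, amp_sig, fs, phase_band=(4.0, 8.0), amp_band=(13.0, 30.0)):
    """Normalized mean vector length between theta phase and Beta amplitude.

    phase_sig: signal supplying the low-frequency phase (e.g. a frontal midline channel).
    amp_sig:   signal supplying the high-frequency amplitude (e.g. a motor-area channel).
    Returns a value between 0 (no coupling) and 1 (amplitude fully locked to one phase).
    """
    phase = np.angle(hilbert(bandpass(phase_sig, *phase_band, fs)))
    amp = np.abs(hilbert(bandpass(amp_sig, *amp_band, fs)))
    return np.abs(np.mean(amp * np.exp(1j * phase))) / np.mean(amp)


# Synthetic demonstration (assumed parameters, not data from the study).
fs = 500.0                               # sampling rate in Hz
t = np.arange(0, 10, 1 / fs)             # 10 s of samples
rng = np.random.default_rng(0)

theta = np.cos(2 * np.pi * 5 * t)        # 5 Hz "FMT-like" oscillation
beta_carrier = np.sin(2 * np.pi * 21 * t)  # 21 Hz "Beta-like" oscillation

# Coupled case: Beta amplitude rises and falls with the theta cycle.
beta_coupled = (1 + theta) * beta_carrier + 0.1 * rng.standard_normal(t.size)
# Uncoupled case: Beta amplitude is constant.
beta_uncoupled = beta_carrier + 0.1 * rng.standard_normal(t.size)

coupled_mvl = pac_mvl(theta, beta_coupled, fs)
uncoupled_mvl = pac_mvl(theta, beta_uncoupled, fs)
print(f"coupled: {coupled_mvl:.2f}, uncoupled: {uncoupled_mvl:.2f}")
```

In this toy setup the index should be clearly higher for the amplitude-modulated signal than for the constant-amplitude one. With real EEG, such an index is usually compared against a surrogate distribution (for example, by shuffling trials or shifting the phase series in time) before interpreting it.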