Xu Shuyuan, Sun Yuyan, Huang Min, Huang Yanhong, Han Jing, Tang Xuemei, Ren Wei
MOE Key Laboratory of Modern Teaching Technology, Shaanxi Normal University, Xi'an, China.
School of Foreign Studies, Anhui Polytechnic University, Wuhu, China.
Front Psychol. 2021 May 10;12:647263. doi: 10.3389/fpsyg.2021.647263. eCollection 2021.
Reinforcement learning relies on reward prediction error (RPE) signals conveyed by the midbrain dopamine system. Previous studies have shown that dopamine plays an important role in both positive and negative reinforcement. However, whether different reinforcement processes induce distinct learning signals remains unclear. In a probabilistic learning task, we examined RPE signals under different reinforcement types using an electrophysiological index, the feedback-related negativity (FRN). Ninety-four participants were randomly assigned to four groups: base (no monetary incentive), positive reinforcement (presentation of monetary rewards), negative reinforcement (removal of monetary losses), and combined reinforcement (monetary rewards and removal of monetary losses). In addition, to evaluate the engagement of emotional activity in the different reinforcement processes, the Positive and Negative Affect Schedule-Expanded Form (PANAS-X) was administered before and after the experiment to detect emotional changes. The results showed no difference between groups in the dopamine-related learning bias. However, compared with the other three groups, negative reinforcement elicited a smaller FRN (the difference-wave measure) during learning, and stronger positive affect and joviality and less fatigue after learning, although the difference between the negative and positive reinforcement groups was smaller. These results suggest that pure avoidance motivation may induce distinct emotional fluctuations that influence feedback processing.
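For readers unfamiliar with the term, an RPE is commonly formalized as the difference between the reward received and the reward expected. The sketch below uses a simple delta-rule update of the Rescorla-Wagner kind. This is an illustrative assumption, not the model or analysis used in the study; the function name, learning rate, and reward sequence are hypothetical.

```python
def update_value(value, reward, learning_rate=0.1):
    """One delta-rule step (illustrative assumption, not the study's model).

    Returns the reward prediction error and the updated expected value.
    """
    rpe = reward - value                     # positive if outcome beat expectation
    new_value = value + learning_rate * rpe  # move expectation toward the outcome
    return rpe, new_value


if __name__ == "__main__":
    value = 0.0  # initial expected reward (hypothetical)
    for reward in [1, 1, 0, 1]:  # hypothetical feedback sequence
        rpe, value = update_value(value, reward)
        print(f"reward={reward} rpe={rpe:+.3f} value={value:.3f}")
```

In this sketch, an outcome worse than expected yields a negative RPE and lowers the expected value, while an outcome better than expected does the opposite.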