Department of Psychology, Faculty of Education, Qufu Normal University, Qufu, Shandong, China.
Int J Psychophysiol. 2020 Apr;150:11-19. doi: 10.1016/j.ijpsycho.2020.01.004. Epub 2020 Jan 23.
Effective behavior monitoring, which includes internal monitoring (error detection) and external monitoring (feedback), is pivotal for reinforcement learning. However, little attention has been paid to internal monitoring and to dynamic learning performance in reinforcement learning, and it is still debated which kind of external feedback reinforcement learning relies on. To address these questions, an adapted probabilistic selection task was used to examine the effects of internal monitoring and external feedback, and the relationship between them, in approach learners and avoidance learners during the dynamic process of reinforcement learning and behavioral adaptation. Error-related negativity (ERN), feedback-related negativity (FRN), and the feedback-related P300 are three ERP components that can be used as indexes of internal monitoring, external feedback, and behavioral adaptation. The ERN effect of avoidance learners became large in block 3, earlier than that of approach learners (block 4), which suggests that avoidance learners learned faster than approach learners. In addition, the FRN amplitude of avoidance learners in block 4 was significantly smaller than in the other three blocks. Together, these results indicate a trade-off between the ERN and FRN effects.