University of Utah, Department of Health, Kinesiology, and Recreation, United States; University of Utah, Department of Physical Therapy and Athletic Training, United States.
Auburn University, School of Kinesiology, United States; Auburn University, Center for Neuroscience, United States.
Biol Psychol. 2020 Jan;149:107775. doi: 10.1016/j.biopsycho.2019.107775. Epub 2019 Sep 26.
Reward positivity (RewP) is an EEG component reflecting reward-prediction errors. Using multilevel models, we measured single-trial RewP amplitude from trial to trial, while reward and prediction varied during learning. Sixty participants completed a category-learning task in either engaging or sterile conditions, with the RewP time-locked to feedback. Sequential analysis of single-trial RewP showed its relationship to current and previous accuracy, and to the probability of changing one's response to subsequent stimuli. Simulations show these effects can be explained in detail by the dynamics of participants' expectations according to principles of reinforcement learning. The single-trial RewP findings were consistent with previous literature linking RewP to reward-prediction error under reinforcement-learning theory. In contrast, the aggregate RewP was unrelated to the engagement manipulation or to delayed retention performance. Thus, the present results provide a detailed computational account of how RewP relates to acute adaptation, but suggest RewP plays little role in long-term learning.
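The reinforcement-learning account mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the simulation model, so the code below assumes a generic delta-rule (Rescorla-Wagner style) learner; the function name `simulate_prediction_errors`, the learning rate `alpha`, and the starting expectation `v0` are hypothetical choices for illustration, not the authors' implementation. It shows how the prediction error on correct feedback shrinks as the expectation of reward grows, while an unexpected error produces a large negative prediction error.

```python
def simulate_prediction_errors(outcomes, alpha=0.2, v0=0.5):
    """Generic delta-rule learner (an assumption, not the paper's model).

    outcomes: sequence of rewards, 1 for correct feedback and 0 for incorrect.
    alpha:    hypothetical learning rate in (0, 1].
    v0:       hypothetical initial expectation of reward.
    Returns a list of (expectation_before_feedback, prediction_error) tuples.
    """
    expectation = v0
    history = []
    for reward in outcomes:
        # Reward-prediction error: obtained outcome minus expected outcome.
        delta = reward - expectation
        history.append((expectation, delta))
        # Move the expectation toward the outcome in proportion to the error.
        expectation += alpha * delta
    return history


if __name__ == "__main__":
    # Three correct trials followed by one error.
    for trial, (v, d) in enumerate(
        simulate_prediction_errors([1, 1, 1, 0], alpha=0.5), start=1
    ):
        print(f"trial {trial}: expectation={v:.4f}, prediction error={d:+.4f}")
```

Under this sketch, a trial-level signal that tracks the prediction error would depend on both the current outcome and the preceding outcomes, because earlier feedback sets the expectation against which the current feedback is compared.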