Reinforcement learning in motor skill acquisition: using the reward positivity to understand the mechanisms underlying short- and long-term behavior adaptation.

Author information

Bacelar Mariane F B, Lohse Keith R, Parma Juliana O, Miller Matthew W

Affiliations

Department of Kinesiology, Boise State University, Boise, ID, United States.

Program in Physical Therapy, Washington University School of Medicine, St. Louis, MO, United States.

Publication information

Front Behav Neurosci. 2024 Oct 30;18:1466970. doi: 10.3389/fnbeh.2024.1466970. eCollection 2024.

Abstract

INTRODUCTION

According to reinforcement learning theory, humans adjust their behavior based on the difference between actual and anticipated outcomes (i.e., prediction error), with the main goal of maximizing rewards through their actions. Although reinforcement learning offers a strong theoretical framework for understanding how we acquire motor skills, very few studies have investigated its predictions and underlying mechanisms in motor skill acquisition.
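To make the notion of prediction error concrete, the following is a minimal sketch of a generic delta-rule update, in which the expected outcome is nudged toward the actual outcome on each trial. It is an illustration only and is not taken from the study: the function names, the learning rate, and the outcome values are all hypothetical.

```python
def update_expectation(expected, actual, learning_rate=0.1):
    """Compute one trial's prediction error and the updated expectation.

    Generic delta rule (an assumption for illustration, not the
    authors' model): prediction error = actual - expected.
    """
    prediction_error = actual - expected
    new_expected = expected + learning_rate * prediction_error
    return prediction_error, new_expected


def run_trials(outcomes, initial_expectation=0.0, learning_rate=0.1):
    """Apply the update over a sequence of outcomes.

    Returns the per-trial prediction errors and the final expectation.
    """
    expected = initial_expectation
    errors = []
    for actual in outcomes:
        error, expected = update_expectation(expected, actual, learning_rate)
        errors.append(error)
    return errors, expected


if __name__ == "__main__":
    # Hypothetical outcomes: two equally rewarded trials in a row.
    errors, final_expectation = run_trials([1.0, 1.0], learning_rate=0.5)
    print(errors, final_expectation)
```

In this sketch, a positive prediction error means the outcome was better than anticipated, and the error shrinks across repeated identical outcomes as the expectation catches up.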

METHODS

In the present study, we explored a 134-person dataset consisting of learners' feedback-evoked brain activity (reward positivity; RewP) and motor accuracy during the practice phase and delayed retention test to investigate whether these variables interacted according to reinforcement learning predictions.

RESULTS

Results showed a non-linear relationship between RewP and trial accuracy, which was moderated by the learners' performance level. Specifically, high-performing learners were more sensitive to violations in reward expectations compared to low-performing learners, likely because they developed a stronger representation of the skill and were able to rely on more stable outcome predictions. Furthermore, contrary to our prediction, the average RewP during acquisition did not predict performance on the delayed retention test.

DISCUSSION

Together, these findings support the use of reinforcement learning models to understand short-term behavior adaptation and highlight the complexity of the motor skill consolidation process, the study of which would benefit from a multi-mechanistic approach.
