

Reinforcement Learning during Locomotion.

Affiliations

Department of Physical Therapy, University of Delaware, Newark, Delaware 19713.

Interdisciplinary Graduate Program in Biomechanics & Movement Science, University of Delaware, Newark, Delaware 19713.

Publication information

eNeuro. 2024 Mar 15;11(3). doi: 10.1523/ENEURO.0383-23.2024. Print 2024 Mar.

Abstract

When learning a new motor skill, people often must use trial and error to discover which movement is best. In the reinforcement learning framework, this concept is known as exploration and has been linked to increased movement variability in motor tasks. For locomotor tasks, however, increased variability decreases upright stability. As such, exploration during gait may jeopardize balance and safety, making reinforcement learning less effective. Therefore, we set out to determine if humans could acquire and retain a novel locomotor pattern using reinforcement learning alone. Young healthy male and female participants walked on a treadmill and were provided with binary reward feedback (indicated by a green checkmark on the screen) that was tied to a fixed monetary bonus, to learn a novel stepping pattern. We also recruited a comparison group who walked with the same novel stepping pattern but did so by correcting for target error, induced by providing real-time veridical visual feedback of steps and a target. In two experiments, we compared learning, motor variability, and two forms of motor memories between the groups. We found that individuals in the binary reward group did, in fact, acquire the new walking pattern by exploring (increasing motor variability). Additionally, while reinforcement learning did not increase implicit motor memories, it resulted in more accurate explicit motor memories compared with the target error group. Overall, these results demonstrate that humans can acquire new walking patterns with reinforcement learning and retain much of the learning over 24 h.
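The idea of learning from binary reward through exploration can be illustrated with a toy simulation. The sketch below is not the authors' model or analysis. It assumes a simple "explore after failure, exploit after success" learner, and every name and number in it (`aim`, `target`, `tolerance`, `explore_sd`, `exploit_sd`, `learning_rate`, arbitrary units for the step parameter) is hypothetical and chosen only for illustration.

```python
import random


def simulate_binary_reward_learning(n_steps=2000, target=10.0, tolerance=3.0,
                                    explore_sd=4.0, exploit_sd=1.0,
                                    learning_rate=0.5, seed=0):
    """Toy learner that sees only success/failure feedback on each step.

    All parameters are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    aim = 0.0          # learner's intended step parameter (baseline = 0)
    sd = explore_sd    # current motor variability
    rewards = []
    for _ in range(n_steps):
        # Executed step = intended step plus motor variability.
        step = rng.gauss(aim, sd)
        # Binary feedback: reward only if the step lands near the target.
        rewarded = abs(step - target) <= tolerance
        rewards.append(rewarded)
        if rewarded:
            # Move the aim toward the step that earned reward,
            # then reduce variability to exploit it.
            aim += learning_rate * (step - aim)
            sd = exploit_sd
        else:
            # No reward: keep the aim and increase variability to explore.
            sd = explore_sd
    return aim, rewards


final_aim, rewards = simulate_binary_reward_learning()
print(f"final aim: {final_aim:.2f}, rewarded steps: {sum(rewards)} of {len(rewards)}")
```

In this sketch the learner never sees the size or direction of its error. It reaches the rewarded region only because unrewarded steps trigger larger variability, which is one simple way to express the link between exploration and motor variability described above.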


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cc75/10946027/4138ac50ea9d/eneuro-11-ENEURO.0383-23.2024-g001.jpg
