
Credit assignment during movement reinforcement learning.

Author information

Department of Behavioral Sciences, University of Rio Grande, Rio Grande, Ohio, USA.

Publication information

PLoS One. 2013;8(2):e55352. doi: 10.1371/journal.pone.0055352. Epub 2013 Feb 8.

Abstract

We often need to learn how to move based on a single performance measure that reflects the overall success of our movements. However, movements have many properties, such as their trajectories, speeds and timing of end-points; the brain therefore needs to decide which properties of movements should be improved, i.e., it needs to solve the credit assignment problem. Currently, little is known about how humans solve credit assignment problems in the context of reinforcement learning. Here we tested how human participants solve such problems during a trajectory-learning task. Without an explicitly defined target movement, participants made hand reaches and received monetary rewards as feedback on a trial-by-trial basis. The curvature and direction of the attempted reach trajectories determined the monetary rewards received, in a manner that could be manipulated experimentally. Based on the history of action-reward pairs, participants quickly solved the credit assignment problem and learned the implicit payoff function. A Bayesian credit-assignment model with built-in forgetting accurately predicts their trial-by-trial learning.
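The abstract describes a Bayesian learner with forgetting that infers an implicit payoff function from action-reward pairs. A minimal sketch of that idea, not the authors' actual model: rewards are assumed to depend linearly on two movement properties (curvature and direction), and a Gaussian posterior over the payoff weights is updated recursively, with the covariance inflated each trial to implement forgetting. All parameter values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical payoff function: reward depends on curvature and direction
# (linear weights and noise level are illustrative, not from the paper)
w_true = np.array([2.0, -1.0])      # unknown to the learner
noise_sd = 0.5

# Gaussian posterior over the payoff weights, updated trial by trial
mu = np.zeros(2)                    # posterior mean
Sigma = np.eye(2) * 10.0            # posterior covariance (broad prior)
forget = 1.05                       # >1 inflates uncertainty each trial (forgetting)

for trial in range(200):
    x = rng.normal(size=2)          # curvature and direction of this reach
    r = w_true @ x + rng.normal(0.0, noise_sd)   # scalar monetary reward

    Sigma = Sigma * forget          # forgetting: old evidence decays
    # Standard Bayesian linear-regression update for one observation
    S_x = Sigma @ x
    k = S_x / (x @ S_x + noise_sd**2)            # Kalman-style gain
    mu = mu + k * (r - mu @ x)
    Sigma = Sigma - np.outer(k, S_x)

print(np.round(mu, 2))              # posterior mean approaches w_true
```

The forgetting factor keeps the posterior from becoming infinitely confident, so the learner would track a payoff function that drifts across trials, which is one way to capture the "built-in forgetting" the abstract refers to.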


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fed4/3568147/40bbb2c60ab5/pone.0055352.g001.jpg
