Training an Actor-Critic Reinforcement Learning Controller for Arm Movement Using Human-Generated Rewards.

Publication

IEEE Trans Neural Syst Rehabil Eng. 2017 Oct;25(10):1892-1905. doi: 10.1109/TNSRE.2017.2700395. Epub 2017 May 2.

Abstract

Functional Electrical Stimulation (FES) employs neuroprostheses to apply electrical current to the nerves and muscles of individuals paralyzed by spinal cord injury to restore voluntary movement. Neuroprosthesis controllers calculate stimulation patterns to produce desired actions. To date, no existing controller is able to efficiently adapt its control strategy to the wide range of possible physiological arm characteristics, reaching movements, and user preferences that vary over time. Reinforcement learning (RL) is a control strategy that can incorporate human reward signals as inputs to allow human users to shape controller behavior. In this paper, ten neurologically intact human participants assigned subjective numerical rewards to train RL controllers, evaluating animations of goal-oriented reaching tasks performed using a planar musculoskeletal human arm simulation. The RL controller learning achieved using human trainers was compared with learning accomplished using human-like rewards generated by an algorithm; metrics included success at reaching the specified target; time required to reach the target; and target overshoot. Both sets of controllers learned efficiently and with minimal differences, significantly outperforming standard controllers. Reward positivity and consistency were found to be unrelated to learning success. These results suggest that human rewards can be used effectively to train RL-based FES controllers.
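The training scheme described in the abstract — an actor-critic controller updated from scalar reward ratings — can be sketched in miniature. The toy 1-D reaching task, the algorithmic stand-in for the human trainer's ratings (echoing the paper's "human-like rewards" comparison condition), and all hyperparameters below are illustrative assumptions; the paper itself uses a planar musculoskeletal arm simulation, not this setup.

```python
import math
import random

random.seed(0)

N = 11            # discrete arm positions 0..10 (toy stand-in for the arm state)
TARGET = 8        # reaching target
ACTIONS = [-1, +1]

def humanlike_reward(pos):
    """Algorithmic stand-in for a human trainer's subjective rating:
    higher score the closer the endpoint is to the target."""
    return -abs(pos - TARGET)

# Tabular critic (state values) and actor (softmax action preferences)
values = [0.0] * N
prefs = [[0.0, 0.0] for _ in range(N)]
alpha_actor, alpha_critic, gamma = 0.1, 0.2, 0.95

def policy(pos):
    """Sample an action from the softmax over the actor's preferences."""
    exps = [math.exp(p) for p in prefs[pos]]
    z = sum(exps)
    probs = [e / z for e in exps]
    a = 0 if random.random() < probs[0] else 1
    return a, probs

for episode in range(500):
    pos = 0
    for t in range(50):
        a, probs = policy(pos)
        nxt = min(max(pos + ACTIONS[a], 0), N - 1)
        r = humanlike_reward(nxt)
        done = nxt == TARGET
        # One-step TD error drives both critic and actor updates
        td = r + (0.0 if done else gamma * values[nxt]) - values[pos]
        values[pos] += alpha_critic * td
        for i in range(2):
            grad = (1.0 if i == a else 0.0) - probs[i]
            prefs[pos][i] += alpha_actor * td * grad
        pos = nxt
        if done:
            break

# After training, a greedy rollout should reach the target directly
pos, steps = 0, 0
while pos != TARGET and steps < 20:
    a = 0 if prefs[pos][0] > prefs[pos][1] else 1
    pos = min(max(pos + ACTIONS[a], 0), N - 1)
    steps += 1
print(steps)  # greedy steps from position 0 to the target
```

Swapping `humanlike_reward` for an interactive prompt that collects a numeric rating from a person is the essential change that would turn this algorithmic-reward sketch into the human-trainer condition the paper evaluates.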


Similar articles

Feedback Control of Functional Electrical Stimulation for 2-D Arm Reaching Movements.
IEEE Trans Neural Syst Rehabil Eng. 2018 Oct;26(10):2033-2043. doi: 10.1109/TNSRE.2018.2853573. Epub 2018 Jul 5.

Cited by

Holding Static Arm Configurations With Functional Electrical Stimulation: A Case Study.
IEEE Trans Neural Syst Rehabil Eng. 2018 Oct;26(10):2044-2052. doi: 10.1109/TNSRE.2018.2866226. Epub 2018 Aug 20.

References

Functional electrical stimulation for neuromuscular applications.
Annu Rev Biomed Eng. 2005;7:327-60. doi: 10.1146/annurev.bioeng.6.040803.140103.
