Neuromuscular control of the point to point and oscillatory movements of a sagittal arm with the actor-critic reinforcement learning method.

Authors

Golkhou Vahid, Parnianpour Mohamad, Lucas Caro

Affiliation

Biomechanics Laboratory, Department of Mechanical Engineering, Sharif University of Technology, Azadi Avenue, P.O. Box 11365-9567, Tehran, Iran.

Publication information

Comput Methods Biomech Biomed Engin. 2005 Apr;8(2):103-13. doi: 10.1080/10255840500167952.

DOI: 10.1080/10255840500167952

PMID: 16154874

Abstract

In this study, we used a single-link system with a pair of muscles excited by alpha and gamma signals to achieve both point-to-point and oscillatory movements of variable amplitude and frequency. The system is highly nonlinear in all its physical and physiological attributes. Its major physiological characteristics are the simultaneous activation of a pair of nonlinear muscle-like actuators for control purposes, the presence of nonlinear spindle-like sensors and a Golgi tendon organ-like sensor, and the actions of gravity and external loading. Transmission delays are included in the afferent and efferent neural paths to represent the reflex loops more accurately. A reinforcement learning method with an actor-critic (AC) architecture, standing in for the middle and low levels of the central nervous system (CNS), is used to track a desired trajectory. The actor in this structure is a two-layer feedforward neural network, and the critic is a model of the cerebellum. The critic is trained by the state-action-reward-state-action (SARSA) method and in turn trains the actor by supervised learning based on prior experience. Simulation studies of oscillatory movements based on the proposed algorithm demonstrate excellent tracking capability: after 280 epochs, the RMS errors for the position and velocity profiles were 0.02 rad and 0.04 rad/s, respectively.
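The abstract names two quantities that a short sketch may make more concrete: the SARSA temporal-difference update used to train the critic, and the RMS tracking error used to report results. The Python below is only an illustration under stated assumptions, not the authors' implementation. It assumes a linear critic over a hand-supplied feature vector, whereas the paper's critic is a cerebellar model, and the names `sarsa_update`, `rms_error`, `step_size`, and `discount` are hypothetical.

```python
import math


def sarsa_update(weights, features, reward, next_features,
                 step_size=0.1, discount=0.9):
    """One SARSA step for an assumed linear critic Q(s, a) = w . phi(s, a).

    `features` is phi(s, a) for the action actually taken, and
    `next_features` is phi(s', a') for the action actually chosen next
    (on-policy, which is what distinguishes SARSA from Q-learning).
    `discount` is the RL discount factor, unrelated to the gamma
    motoneuron signal mentioned in the abstract.
    """
    q_now = sum(w * f for w, f in zip(weights, features))
    q_next = sum(w * f for w, f in zip(weights, next_features))
    td_error = reward + discount * q_next - q_now
    new_weights = [w + step_size * td_error * f
                   for w, f in zip(weights, features)]
    return new_weights, td_error


def rms_error(desired, actual):
    """Root-mean-square error between a desired and an actual trajectory."""
    if len(desired) != len(actual) or not desired:
        raise ValueError("trajectories must be non-empty and of equal length")
    squared = sum((d - a) ** 2 for d, a in zip(desired, actual))
    return math.sqrt(squared / len(desired))
```

In a tracking task of this kind, the reward passed to `sarsa_update` would typically be some decreasing function of the position and velocity error, and `rms_error` would be evaluated once per epoch on the position profile (in rad) and on the velocity profile (in rad/s) separately.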

Similar articles

Neuromuscular control of the point to point and oscillatory movements of a sagittal arm with the actor-critic reinforcement learning method.

Comput Methods Biomech Biomed Engin. 2005 Apr;8(2):103-13. doi: 10.1080/10255840500167952.

The role of multisensor data fusion in neuromuscular control of a sagittal arm with a pair of muscles using actor-critic reinforcement learning method.

Technol Health Care. 2004;12(6):425-38.

Stability and movement of a one-link neuromusculoskeletal sagittal arm.

IEEE Trans Biomed Eng. 1993 Jun;40(6):541-8. doi: 10.1109/10.237673.

Reflex regulation of antagonist muscles for control of joint equilibrium position.

IEEE Trans Neural Syst Rehabil Eng. 2005 Mar;13(1):60-71. doi: 10.1109/TNSRE.2004.841882.

A novel theoretical framework for the dynamic stability analysis, movement control, and trajectory generation in a multisegment biomechanical model.

J Biomech Eng. 2009 Jan;131(1):011002. doi: 10.1115/1.3002763.

Computation of inverse functions in a model of cerebellar and reflex pathways allows to control a mobile mechanical segment.

Neuroscience. 2005;133(1):29-49. doi: 10.1016/j.neuroscience.2004.09.048. Epub 2005 Apr 22.

Control of single-joint movements with a reversal.

J Electromyogr Kinesiol. 2005 Aug;15(4):406-17. doi: 10.1016/j.jelekin.2004.09.004. Epub 2004 Nov 6.

Control of nonaffine nonlinear discrete-time systems using reinforcement-learning-based linearly parameterized neural networks.

IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):994-1001. doi: 10.1109/TSMCB.2008.926607.

Spatio-temporal separation of roll and pitch balance-correcting commands in humans.

J Neurophysiol. 2005 Nov;94(5):3143-58. doi: 10.1152/jn.00538.2004. Epub 2005 Jul 20.

Reinforcement learning for a biped robot based on a CPG-actor-critic method.

Neural Netw. 2007 Aug;20(6):723-35. doi: 10.1016/j.neunet.2007.01.002. Epub 2007 Feb 20.

Cited by

Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease.

ArXiv. 2025 Jul 8:arXiv:2507.06326v1.

Characterizing Motor Control of Mastication With Soft Actor-Critic.

Front Hum Neurosci. 2020 May 26;14:188. doi: 10.3389/fnhum.2020.00188. eCollection 2020.
