The role of multisensor data fusion in neuromuscular control of a sagittal arm with a pair of muscles using actor-critic reinforcement learning method.

Author information

Golkhou V, Parnianpour M, Lucas C

Affiliation

Department of Mechanical Engineering, Sharif University of Technology, Tehran, Iran.

Publication information

Technol Health Care. 2004;12(6):425-38.

Abstract

In this study, we examine the role of multisensor data fusion in neuromuscular control using an actor-critic reinforcement learning method. The model is a single-link system actuated by a pair of muscles that are excited by alpha and gamma signals. Physiological sensor information from proprioception, muscle spindles, and Golgi tendon organs is integrated to produce an oscillatory movement with variable amplitude and frequency while keeping the movement stable and minimizing metabolic cost and coactivation. The system is highly nonlinear in all its physical and physiological attributes. Transmission delays are included in the afferent and efferent neural paths to represent the reflex loops more accurately. This paper proposes a reinforcement learning method with an Actor-Critic architecture in place of the middle and lower levels of the central nervous system (CNS). The Actor in this structure is a two-layer feedforward neural network, and the Critic is a model of the cerebellum. The Critic is trained by the State-Action-Reward-State-Action (SARSA) method, and it in turn trains the Actor by supervised learning based on previous experiences. The reinforcement signal in SARSA is evaluated based on available alternatives concerning the concept of multisensor data fusion. The effectiveness and the biological plausibility of the present model are demonstrated by several simulations. The system showed excellent tracking capability when the available sensor information was integrated. Adding a penalty on muscle activation resulted in much lower muscle coactivation while keeping the movement stable.
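
The SARSA update and the activation penalty mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes a cerebellar-model Critic and a neural-network Actor, whereas the sketch below assumes a plain tabular Critic for brevity. The function names, the quadratic form of the reward, and the parameters `w_act`, `lr`, and `discount` are all hypothetical choices made for illustration.

```python
import numpy as np


def reward(angle, target, act_flexor, act_extensor, w_act=0.1):
    """Hypothetical reinforcement signal (not taken from the paper).

    Combines a squared tracking error with a penalty on the activation
    of both muscles, so coactivation lowers the reward.
    """
    tracking_cost = (angle - target) ** 2
    activation_cost = w_act * (act_flexor ** 2 + act_extensor ** 2)
    return -(tracking_cost + activation_cost)


def sarsa_update(Q, s, a, r, s_next, a_next, lr=0.1, discount=0.9):
    """One on-policy SARSA step on a tabular critic Q[state, action].

    The target uses the action actually chosen in the next state
    (a_next), which is what distinguishes SARSA from Q-learning.
    Q is modified in place; the temporal-difference error is returned.
    """
    td_error = r + discount * Q[s_next, a_next] - Q[s, a]
    Q[s, a] += lr * td_error
    return td_error


# Usage: 3 discretized states, 2 discretized actions (assumed sizes).
Q = np.zeros((3, 2))
r = reward(angle=0.3, target=0.2, act_flexor=0.5, act_extensor=0.4)
sarsa_update(Q, s=0, a=1, r=r, s_next=1, a_next=0)
```

With the same tracking error, a larger `w_act` or higher activation of both muscles yields a lower reward, which is one simple way a penalty term could discourage coactivation.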
