How the brain can be trained to achieve an intermittent control strategy for stabilizing quiet stance by means of reinforcement learning.

Affiliations

Graduate School of Engineering Science, Osaka University, 1-3 Machikaneyama, Toyonaka, Osaka, 5608531, Japan.

Istituto Italiano di Tecnologia, Via Enrico Melen 83, Bldg B, 16152, Genoa, Italy.

Publication information

Biol Cybern. 2024 Aug;118(3-4):229-248. doi: 10.1007/s00422-024-00993-0. Epub 2024 Jul 12.

Abstract

The stabilization of human quiet stance is achieved by a combination of the intrinsic elastic properties of the ankle muscles and an active closed-loop activation of those muscles, driven by delayed feedback of the ongoing sway angle and the corresponding angular velocity, in the manner of a delayed proportional (P) and derivative (D) feedback controller. It has been shown that the active component of the stabilization process is likely to operate intermittently rather than as a continuous controller: the switching policy is defined in the phase plane, which is divided into dangerous and safe regions separated by appropriate switching boundaries. When the state enters a dangerous region, the delayed PD control is activated; when the state enters a safe region, the control is switched off, leaving the system to evolve freely. Compared with continuous feedback control, the intermittent mechanism is more robust and better able to reproduce the postural sway patterns of healthy people. However, the superior performance of the intermittent control paradigm and its biological plausibility, suggested by experimental evidence of intermittent activation of the ankle muscles, leave open the question of a feasible learning process by which the brain can identify the appropriate state-dependent switching policy and tune the P and D parameters accordingly. This work shows how such a goal can be achieved with a reinforcement motor learning paradigm, building on evidence that the basal ganglia play a central role in reinforcement learning for action selection in general and are specifically involved in postural stabilization in particular.
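The switching logic described in the abstract can be illustrated with a minimal sketch. It is not the authors' model or their learned policy: the switching rule (`theta * omega > 0`), the single-link inverted-pendulum dynamics, all numerical defaults, and the constant initial delay history are assumptions chosen for illustration, and every function name is hypothetical.

```python
from collections import deque


def in_dangerous_region(theta, omega):
    """Hypothetical switching rule (an assumption, not the paper's boundary):
    the state is 'dangerous' when the body is tilting further away from
    upright, i.e. the sway angle and angular velocity share a sign."""
    return theta * omega > 0.0


def intermittent_pd_torque(theta_d, omega_d, P, D):
    """Active ankle torque computed from the delayed state (theta_d, omega_d).
    The PD controller is on in the dangerous region and off otherwise."""
    if in_dangerous_region(theta_d, omega_d):
        return P * theta_d + D * omega_d
    return 0.0


def simulate(theta0, omega0, P, D, steps, dt=0.001, delay=0.2,
             I=60.0, mgh=588.6, K=470.9, B=4.0):
    """Semi-implicit Euler simulation of an inverted pendulum with passive
    ankle stiffness K and damping B plus intermittent delayed PD control.
    All default parameter values are illustrative assumptions."""
    n_delay = int(round(delay / dt))
    # Assumed initial history: the state was constant before t = 0.
    history = deque([(theta0, omega0)] * (n_delay + 1), maxlen=n_delay + 1)
    theta, omega = theta0, omega0
    trajectory = []
    for _ in range(steps):
        theta_d, omega_d = history[0]  # state from n_delay steps ago
        u = intermittent_pd_torque(theta_d, omega_d, P, D)
        # Gravity destabilizes; passive stiffness, damping and u stabilize.
        alpha = (mgh * theta - K * theta - B * omega - u) / I
        omega += alpha * dt
        theta += omega * dt
        history.append((theta, omega))
        trajectory.append((theta, omega, u))
    return trajectory


if __name__ == "__main__":
    traj = simulate(theta0=0.01, omega0=0.0, P=147.0, D=10.0, steps=5000)
    on_fraction = sum(1 for _, _, u in traj if u != 0.0) / len(traj)
    print(f"fraction of time steps with active control: {on_fraction:.2f}")
```

In this sketch the switching rule is fixed by hand; in the setting the abstract describes, the switching policy and the P and D gains would instead be what the reinforcement learning process identifies.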


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96b0/11289178/9bbbc1b9d2f5/422_2024_993_Fig1_HTML.jpg
