School of Engineering Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.
Department of Mechanical Engineering, Stanford University, Stanford, CA 94305, USA.
Curr Biol. 2022 May 23;32(10):2222-2232.e5. doi: 10.1016/j.cub.2022.04.015. Epub 2022 May 9.
Our nervous systems can learn optimal control policies in response to changes to our bodies, tasks, and movement contexts. For example, humans can learn to adapt their control policy in walking contexts where the energy-optimal policy is shifted along variables such as step frequency or step width. However, it is unclear how the nervous system determines which ways to adapt its control policy. Here, we asked how human participants explore through variations in their control policy to identify more optimal policies in new contexts. We created new contexts using exoskeletons that apply assistive torques to each ankle at each walking step. We analyzed four variables that spanned the levels of the whole movement, the joint, and the muscle: step frequency, ankle angle range, total soleus activity, and total medial gastrocnemius activity. We found that, across all of these analyzed variables, variability increased upon initial exposure to new contexts and then decreased with experience. This led to adaptive changes in the magnitude of specific variables, and these changes were correlated with reduced energetic cost. The timescales by which adaptive changes progressed and variability decreased were faster for some variables than others, suggesting a reduced search space within which the nervous system continues to optimize its policy. These collective findings support the principle that exploration through general variability leads to specific adaptation toward optimal movement policies.
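One way to illustrate the comparison of timescales across variables is to track variability in windows of steps and fit an exponential decay to it. The sketch below is not the authors' analysis: the function names, the window length, the exponential form, the caller-supplied baseline `floor`, and the synthetic step-frequency data are all assumptions made for illustration.

```python
import math
import random
import statistics


def windowed_sd(values, window):
    """Sample standard deviation over consecutive, non-overlapping windows."""
    return [
        statistics.stdev(values[start:start + window])
        for start in range(0, len(values) - window + 1, window)
    ]


def fit_time_constant(times, sds, floor):
    """Estimate tau in the assumed model sd(t) = floor + A * exp(-t / tau).

    Fits a straight line to log(sd - floor) by ordinary least squares.
    `floor` is a hypothetical, caller-supplied baseline variability;
    windows at or below it are skipped because their log is undefined.
    """
    points = [(t, math.log(s - floor)) for t, s in zip(times, sds) if s > floor]
    if len(points) < 2:
        raise ValueError("need at least two windows above the floor")
    mean_t = sum(t for t, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    if sxx == 0:
        raise ValueError("window times must not all be equal")
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("variability does not decay over these windows")
    return -1.0 / slope


if __name__ == "__main__":
    # Synthetic data only: step frequency (Hz) whose spread shrinks over steps.
    rng = random.Random(0)  # fixed seed so the example is repeatable
    window = 60
    steps = [
        1.8 + rng.gauss(0.0, 0.02 + 0.08 * math.exp(-i / 300))
        for i in range(1800)
    ]
    sds = windowed_sd(steps, window)
    centers = [window * k + window / 2 for k in range(len(sds))]
    print(fit_time_constant(centers, sds, floor=0.02))
```

Applying the same fit to each variable (step frequency, ankle angle range, soleus activity, medial gastrocnemius activity) would give one time constant per variable, in units of steps, that could then be compared. Windows close to the baseline make the log-linear fit noisy, so a nonlinear fit may be preferable on real data.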