Merel Josh, Carlson David, Paninski Liam, Cunningham John P
Neurobiology and Behavior program, Columbia University, New York, New York, United States of America.
Center for Theoretical Neuroscience, Columbia University, New York, New York, United States of America.
PLoS Comput Biol. 2016 May 18;12(5):e1004948. doi: 10.1371/journal.pcbi.1004948. eCollection 2016 May.
Neuroprosthetic brain-computer interfaces (BCIs) function via an algorithm that decodes the user's neural activity into movements of an end effector, such as a cursor or robotic arm. In practice, the decoder is often learned by updating its parameters while the user performs a task. When the user's intention is not directly observable, recent methods have demonstrated value in training the decoder against a surrogate for the user's intended movement. Here we show that training a decoder in this way is a novel variant of an imitation learning problem, in which an oracle or expert is employed for supervised training in lieu of direct observations, which are not available. Specifically, we describe how a generic imitation learning meta-algorithm, dataset aggregation (DAgger), can be adapted to train a generic brain-computer interface. By deriving existing learning algorithms for brain-computer interfaces in this framework, we provide a novel analysis of regret (an important metric of learning efficacy) for brain-computer interfaces. This analysis allows us to characterize the space of algorithmic variants and bounds on their regret rates. Existing approaches to decoder learning have been demonstrated in the cursor control setting, but the available design principles for these decoders have made it impossible to scale them to naturalistic settings. Leveraging our findings, we then offer an algorithm that combines imitation learning with optimal control, which should allow training of arbitrary effectors for which optimal control can generate goal-oriented control. We demonstrate this novel and general BCI algorithm with simulated neuroprosthetic control of a 26-degree-of-freedom model of an arm, a sophisticated and realistic end effector.
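The DAgger-style training loop described in the abstract can be illustrated with a minimal simulation of 2D cursor control. This is a sketch under stated assumptions, not the authors' implementation. The linear neural encoding model, the ridge-regression decoder, the goal-directed unit-speed oracle, and every function and parameter name (oracle_velocity, simulate_neural, run_trial, fit_decoder, dagger_train) are hypothetical choices made for illustration. On each iteration the current decoder drives the cursor, an oracle labels each visited state with a surrogate for the user's intended velocity, the labeled data are added to an aggregated dataset, and the decoder is refit on everything collected so far.

```python
import numpy as np


def oracle_velocity(pos, goal, speed=1.0):
    """Surrogate intent (assumption): constant-speed velocity pointing at the goal."""
    delta = goal - pos
    dist = np.linalg.norm(delta)
    return speed * delta / dist if dist > 1e-9 else np.zeros_like(delta)


def simulate_neural(intent, encoding, rng, noise_std=0.1):
    """Hypothetical linear encoding: activity = encoding @ intent + Gaussian noise."""
    noise = noise_std * rng.standard_normal(encoding.shape[0])
    return encoding @ intent + noise


def run_trial(decoder, encoding, rng, n_steps=50, dt=0.1):
    """Let the current decoder drive the cursor; record (activity, oracle label) pairs."""
    pos = np.zeros(2)
    goal = rng.uniform(-1.0, 1.0, size=2)
    activity, labels = [], []
    for _ in range(n_steps):
        intent = oracle_velocity(pos, goal)          # oracle label for this state
        neural = simulate_neural(intent, encoding, rng)
        activity.append(neural)
        labels.append(intent)
        pos = pos + dt * (decoder @ neural)          # states come from the decoder's own policy
    return np.array(activity), np.array(labels)


def fit_decoder(X, Y, ridge=1e-3):
    """Ridge regression from neural activity X to intended velocity Y."""
    n_features = X.shape[1]
    weights = np.linalg.solve(X.T @ X + ridge * np.eye(n_features), X.T @ Y)
    return weights.T                                  # shape: (2, n_neurons)


def dagger_train(n_neurons=20, n_iters=8, seed=0):
    """Aggregate data across iterations and refit the decoder on all of it."""
    rng = np.random.default_rng(seed)
    encoding = rng.standard_normal((n_neurons, 2))
    decoder = 0.01 * rng.standard_normal((2, n_neurons))  # poor initial decoder
    all_X, all_Y, errors = [], [], []
    for _ in range(n_iters):
        X, Y = run_trial(decoder, encoding, rng)
        # Error of the decoder that generated this trial, measured against the oracle.
        errors.append(float(np.mean((X @ decoder.T - Y) ** 2)))
        all_X.append(X)
        all_Y.append(Y)
        decoder = fit_decoder(np.vstack(all_X), np.vstack(all_Y))
    return decoder, errors


if __name__ == "__main__":
    decoder, errors = dagger_train()
    print("per-iteration error vs. oracle:", np.round(errors, 4))
```

The defining feature of the aggregation scheme is that each refit uses data from all earlier iterations, so the decoder is trained on the states its own earlier versions actually visited rather than on states from an idealized trajectory. Replacing the ridge refit with an incremental update, or the goal-directed oracle with an optimal controller, would give other variants in the same family.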