Toward an autonomous brain machine interface: integrating sensorimotor reward modulation and reinforcement learning.

Authors

Marsh Brandi T, Tarigoppula Venkata S Aditya, Chen Chen, Francis Joseph T

Affiliations

Joint Program in Biomedical Engineering, New York University-Polytechnic School of Engineering and State University of New York, Downstate Medical Center.

Department of Physiology and Pharmacology.

Publication information

J Neurosci. 2015 May 13;35(19):7374-87. doi: 10.1523/JNEUROSCI.1802-14.2015.

DOI: 10.1523/JNEUROSCI.1802-14.2015

PMID: 25972167

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6705437/

Abstract

For decades, neurophysiologists have worked on elucidating the function of the cortical sensorimotor control system from the standpoint of kinematics or dynamics. Recently, computational neuroscientists have developed models that can emulate changes seen in the primary motor cortex during learning. However, these simulations rely on the existence of a reward-like signal in the primary sensorimotor cortex. Reward modulation of the primary sensorimotor cortex has yet to be characterized at the level of neural units. Here we demonstrate that single units/multiunits and local field potentials in the primary motor (M1) cortex of nonhuman primates (Macaca radiata) are modulated by reward expectation during reaching movements and that this modulation is present even while subjects passively view cursor motions that are predictive of either reward or nonreward. After establishing this reward modulation, we set out to determine whether we could correctly classify rewarding versus nonrewarding trials, on a moment-to-moment basis. This reward information could then be used in collaboration with reinforcement learning principles toward an autonomous brain-machine interface. The autonomous brain-machine interface would use M1 for both decoding movement intention and extraction of reward expectation information as evaluative feedback, which would then update the decoding algorithm as necessary. In the work presented here, we show that this, in theory, is possible.
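
The closed loop described in the abstract (decode movement intent from M1, read a reward-expectation signal from the same area, and use it as evaluative feedback to update the decoder) can be illustrated with a toy simulation. The sketch below is not the authors' method: the simulated firing rates, the function names (`simulate_trial`, `decode`, `critic`, `train`, `evaluate`), the assumed 90% critic accuracy, and the simple reward-modulated weight update are all hypothetical choices made for illustration.

```python
import random


def simulate_trial(rng, n_units=8):
    """Return (features, target) for a toy two-target reach.

    Hypothetical stand-in for binned M1 firing rates: even-indexed units
    fire more for target 0, odd-indexed units fire more for target 1.
    """
    target = rng.randrange(2)
    feats = []
    for unit in range(n_units):
        mean = 1.0 if unit % 2 == target else 0.2
        feats.append(mean + rng.gauss(0.0, 0.3))
    return feats, target


def decode(weights, feats):
    """Linear decoder: choose the action with the larger weighted sum."""
    scores = [sum(w * f for w, f in zip(row, feats)) for row in weights]
    return 0 if scores[0] >= scores[1] else 1


def critic(rng, action, target, accuracy=0.9):
    """Toy critic returning +1 (rewarding) or -1 (nonrewarding).

    Stands in for a classifier of reward expectation; its label is
    flipped with probability 1 - accuracy (an assumed value).
    """
    truth = 1 if action == target else -1
    return truth if rng.random() < accuracy else -truth


def train(n_trials=2000, lr=0.05, epsilon=0.1, n_units=8, seed=0):
    """Update the decoder using only the critic's noisy feedback."""
    rng = random.Random(seed)
    weights = [[0.0] * n_units for _ in range(2)]
    for _ in range(n_trials):
        feats, target = simulate_trial(rng, n_units)
        action = decode(weights, feats)
        if rng.random() < epsilon:  # occasional random exploration
            action = rng.randrange(2)
        r = critic(rng, action, target)
        # Reward-modulated update: strengthen the chosen action's weights
        # after +1 feedback, weaken them after -1 feedback.
        for i, f in enumerate(feats):
            weights[action][i] += lr * r * f
    return weights


def evaluate(weights, n_trials=500, seed=1):
    """Fraction of fresh simulated trials decoded correctly (no updates)."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        feats, target = simulate_trial(rng, len(weights[0]))
        correct += int(decode(weights, feats) == target)
    return correct / n_trials
```

In this toy setting, comparing `evaluate(train())` with `evaluate` on all-zero weights shows whether noisy evaluative feedback alone can shape the decoder. The learner never sees the true target, only the critic's +1 or -1 output.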
