
Task Learning Over Multi-Day Recording via Internally Rewarded Reinforcement Learning Based Brain Machine Interfaces

Publication Information

IEEE Trans Neural Syst Rehabil Eng. 2020 Dec;28(12):3089-3099. doi: 10.1109/TNSRE.2020.3039970. Epub 2021 Jan 28.

DOI: 10.1109/TNSRE.2020.3039970
PMID: 33232240
Abstract

Autonomous brain machine interfaces (BMIs) aim to enable paralyzed people to self-evaluate their movement intention to control external devices. Previous reinforcement learning (RL)-based decoders interpret the mapping between neural activity and movements using an external reward for well-trained subjects, and have not investigated the task learning procedure. The brain has developed a learning mechanism to identify the correct actions that lead to rewards in a new task. This internal guidance can be used to replace the external reference and advance BMIs toward an autonomous system. In this study, we propose an internally rewarded reinforcement learning-based BMI framework that uses multi-site recording to demonstrate the autonomous learning ability of the BMI decoder on a new task. We test the model on neural data collected over multiple days while rats were learning a new lever-discrimination task. Primary motor cortex (M1) and medial prefrontal cortex (mPFC) spikes are interpreted by the proposed RL framework into discrete lever-press actions. The mPFC activity following the action period is interpreted as the internal reward information, where a support vector machine classifies reward vs. non-reward trials with a high accuracy of 87.5% across subjects. This internal reward replaces the external water reward to update the decoder, which is able to adapt to the nonstationary neural activity during subject learning. The multi-cortical recording allows the decoder to take in more cortical signals as input and to use internal critics to guide decoder learning. Compared with the classic decoder using M1 activity as the only input with external guidance, the proposed system with multi-cortical recordings shows better decoding accuracy. More importantly, our internally rewarded decoder demonstrates autonomous learning ability on the new task: it successfully tracks the time-variant neural patterns while subjects are learning, and improves asymptotically as the subjects' behavioral learning progresses. This reveals the potential of endowing BMIs with autonomous task-learning ability in the RL framework.
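The pipeline the abstract describes — train a classifier to read an internal reward signal from post-action mPFC activity, then use that predicted reward in place of the external water reward when updating the RL decoder — can be sketched as follows. This is a minimal illustration under synthetic assumptions, not the paper's implementation: a perceptron stands in for the SVM, a tabular Q-update stands in for the paper's RL decoder, and all feature dimensions, rates, and spike statistics are invented.

```python
import random

random.seed(0)

N_FEAT = 8          # stand-in for binned post-action mPFC spike counts (invented)
ACTIONS = (0, 1)    # the two discrete lever-press actions
CORRECT_LEVER = 1   # the lever the task rewards (hypothetical)

def mpfc_features(rewarded):
    """Synthetic post-action mPFC activity. Rewarded trials shift the first
    half of the features upward -- a made-up separability assumption."""
    x = [random.gauss(0.0, 1.0) for _ in range(N_FEAT)]
    if rewarded:
        x = [v + 2.0 for v in x[:N_FEAT // 2]] + x[N_FEAT // 2:]
    return x

class RewardClassifier:
    """Perceptron stand-in for the paper's SVM reward/non-reward classifier."""
    def __init__(self):
        self.w = [0.0] * N_FEAT
        self.b = 0.0

    def score(self, x):
        return sum(w * v for w, v in zip(self.w, x)) + self.b

    def predict(self, x):
        return self.score(x) > 0.0

    def fit(self, xs, ys, lr=0.1, epochs=20):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                t = 1.0 if y else -1.0
                if t * self.score(x) <= 0.0:   # misclassified: nudge weights
                    self.w = [w + lr * t * v for w, v in zip(self.w, x)]
                    self.b += lr * t

def calibrate(n_trials=300):
    """Supervised phase: while the external (water) reward label is still
    available, fit the internal-reward classifier on post-action mPFC activity."""
    xs, ys = [], []
    for _ in range(n_trials):
        rewarded = random.random() < 0.5
        xs.append(mpfc_features(rewarded))
        ys.append(rewarded)
    clf = RewardClassifier()
    clf.fit(xs, ys)
    return clf

def learn_task(clf, n_trials=500, alpha=0.1, eps=0.1):
    """Autonomous phase: a tabular Q-update stands in for the RL decoder.
    The update uses the classifier's predicted internal reward instead of
    the external water reward."""
    q = {a: 0.0 for a in ACTIONS}
    for _ in range(n_trials):
        if random.random() < eps:              # epsilon-greedy exploration
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda k: q[k])
        x = mpfc_features(a == CORRECT_LEVER)  # observe post-action mPFC activity
        r = 1.0 if clf.predict(x) else 0.0     # internal reward replaces water
        q[a] += alpha * (r - q[a])
    return q
```

The calibrate-then-run split mirrors the abstract's logic: external labels are needed once to learn the reward readout, after which the decoder can keep adapting to nonstationary activity using only the internal critic.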


Similar Articles

1. Task Learning Over Multi-Day Recording via Internally Rewarded Reinforcement Learning Based Brain Machine Interfaces.
   IEEE Trans Neural Syst Rehabil Eng. 2020 Dec;28(12):3089-3099. doi: 10.1109/TNSRE.2020.3039970. Epub 2021 Jan 28.
2. Estimating Reward Function from Medial Prefrontal Cortex Cortical Activity using Inverse Reinforcement Learning.
   Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:3346-3349. doi: 10.1109/EMBC48229.2022.9871194.
3. Reinforcement Learning based Decoding Using Internal Reward for Time Delayed Task in Brain Machine Interfaces.
   Annu Int Conf IEEE Eng Med Biol Soc. 2020 Jul;2020:3351-3354. doi: 10.1109/EMBC44109.2020.9175964.
4. Intermediate Sensory Feedback Assisted Multi-Step Neural Decoding for Reinforcement Learning Based Brain-Machine Interfaces.
   IEEE Trans Neural Syst Rehabil Eng. 2022;30:2834-2844. doi: 10.1109/TNSRE.2022.3210700. Epub 2022 Oct 20.
5. Audio-induced medial prefrontal cortical dynamics enhances coadaptive learning in brain-machine interfaces.
   J Neural Eng. 2023 Oct 17;20(5). doi: 10.1088/1741-2552/ad017d.
6. Feedback for reinforcement learning based brain-machine interfaces using confidence metrics.
   J Neural Eng. 2017 Jun;14(3):036016. doi: 10.1088/1741-2552/aa6317. Epub 2017 Feb 27.
7. State-space Model Based Inverse Reinforcement Learning for Reward Function Estimation in Brain-machine Interfaces.
   Annu Int Conf IEEE Eng Med Biol Soc. 2023 Jul;2023:1-4. doi: 10.1109/EMBC40787.2023.10340953.
8. A Kernel Reinforcement Learning Decoding Framework Integrating Neural and Feedback Signals for Brain Control.
   Annu Int Conf IEEE Eng Med Biol Soc. 2023 Jul;2023:1-4. doi: 10.1109/EMBC40787.2023.10340203.
9. Hierarchical Dynamical Model for Multiple Cortical Neural Decoding.
   Neural Comput. 2021 Apr 13;33(5):1372-1401. doi: 10.1162/neco_a_01380.
10. Kernel Temporal Difference based Reinforcement Learning for Brain Machine Interfaces.
    Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov;2021:6721-6724. doi: 10.1109/EMBC46164.2021.9631086.

Cited By

1. Neural Decoders Using Reinforcement Learning in Brain Machine Interfaces: A Technical Review.
   Front Syst Neurosci. 2022 Aug 26;16:836778. doi: 10.3389/fnsys.2022.836778. eCollection 2022.