Estimating Reward Function from Medial Prefrontal Cortex Cortical Activity using Inverse Reinforcement Learning.

Publication Information

Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:3346-3349. doi: 10.1109/EMBC48229.2022.9871194.

Abstract

Reinforcement learning (RL)-based brain-machine interfaces (BMIs) learn the mapping from neural signals to the subject's intention using a reward signal. In the existing RL-based BMI framework, external rewards (water or food) or internal rewards extracted from neural activity are used to update the decoder's parameters. However, for complex tasks, designing an external reward can be difficult, and such a reward may not fully reflect the subject's own internal evaluation. It is therefore important to obtain an internal reward model from neural activity, so as to access the subject's internal evaluation while the subject performs the task through trial and error. In this paper, we propose to use an inverse reinforcement learning (IRL) method to estimate the internal reward function interpreted from neural activity and to use it to assist the update of the decoder. Specifically, the inverse Q-learning (IQL) algorithm is applied to extract internal reward information from real data collected from the medial prefrontal cortex (mPFC) of a rat learning a two-lever-press discrimination task. The extracted internal reward information is validated by checking whether it can guide the training of an RL decoder to complete the movement task. Compared with an RL decoder trained with the external reward, our approach achieves similar decoding performance. This preliminary result validates the effectiveness of using IRL to obtain an internal reward model, and it reveals the potential of internal reward estimation to improve the design of autonomous learning BMIs.
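The abstract does not include implementation details, but the core computation in tabular inverse Q-learning is standard: assume the demonstrator acts Boltzmann-rationally with respect to an optimal Q-function, so that Q(s,a) - Q(s,b) = log π(a|s) - log π(b|s); combined with the Bellman equation, the reward r(s,a) becomes recoverable from the demonstrator's log-policy up to a per-state shaping constant. The sketch below illustrates this idea under stated assumptions: the function name, the (s, a, s') transition format, and the discretization of neural data into integer state indices (e.g., binned mPFC population states and the two lever-press actions) are all illustrative choices, not the authors' code.

```python
import numpy as np

def tabular_inverse_q_learning(transitions, n_states, n_actions,
                               gamma=0.9, n_iters=200):
    """Illustrative tabular inverse Q-learning sketch (not the paper's code).

    transitions: list of (s, a, s_next) tuples observed from the
    demonstrator, with integer indices (here assumed to be binned mPFC
    population states and lever-press actions).
    Returns estimated reward r[s, a] and Q[s, a]; the reward is only
    identifiable up to a per-state shaping constant.
    """
    # Estimate the demonstrator's policy from action counts (smoothed).
    counts = np.full((n_states, n_actions), 1e-3)
    for s, a, _ in transitions:
        counts[s, a] += 1.0
    log_pi = np.log(counts / counts.sum(axis=1, keepdims=True))

    # Estimate the transition model P(s' | s, a) from the same data.
    P = np.full((n_states, n_actions, n_states), 1e-6)
    for s, a, s_next in transitions:
        P[s, a, s_next] += 1.0
    P /= P.sum(axis=2, keepdims=True)

    r = np.zeros((n_states, n_actions))
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        V = Q.max(axis=1)                 # greedy state values
        # Boltzmann rationality implies r(s,a) - eta(s,a) is constant
        # across actions within a state, where
        # eta(s,a) = log pi(a|s) - gamma * E[V(s')].
        eta = log_pi - gamma * (P @ V)    # (S, A)
        resid = r - eta
        # Enforce per-state constancy by averaging the residual over
        # the other actions (with two lever actions, n_actions - 1 = 1).
        r = eta + (resid.sum(axis=1, keepdims=True) - resid) / (n_actions - 1)
        Q = r + gamma * (P @ V)           # one value-iteration backup
    return r, Q
```

In the validation step described in the abstract, a reward table estimated this way would stand in for the external water reward when training the RL decoder, i.e., it would supply the reward term in the decoder's update rule; the reported comparison is between decoders trained with these two reward sources.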

