

An extended reinforcement learning model of basal ganglia to understand the contributions of serotonin and dopamine in risk-based decision making, reward prediction, and punishment learning.

Affiliations

Department of Biotechnology, Indian Institute of Technology Madras, Chennai, India.

Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India.

Publication Information

Front Comput Neurosci. 2014 Apr 16;8:47. doi: 10.3389/fncom.2014.00047. eCollection 2014.

Abstract

Although empirical and neural studies show that serotonin (5HT) plays many functional roles in the brain, prior computational models have mostly focused on its role in behavioral inhibition. In this study, we present a model of risk-based decision making in a modified Reinforcement Learning (RL) framework. The model depicts the roles of dopamine (DA) and serotonin (5HT) in the Basal Ganglia (BG). In this model, the DA signal is represented by the temporal difference error (δ), while the 5HT signal is represented by a parameter (α) that controls the risk prediction error. This formulation, which accommodates both 5HT and DA, reconciles some of the diverse roles of 5HT, particularly in connection with the BG system. We apply the model to different experimental paradigms used to study the role of 5HT: (1) risk-sensitive decision making, where 5HT controls risk assessment; (2) temporal reward prediction, where 5HT controls the time scale of reward prediction; and (3) reward/punishment sensitivity, in which the punishment prediction error depends on 5HT levels. Thus, the proposed integrated RL model reconciles several existing theories of 5HT and DA in the BG.
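The abstract's division of labor, with DA as the temporal difference error (δ) and 5HT as a parameter (α) weighting risk, can be illustrated with a small risk-sensitive bandit simulation. The sketch below is not the paper's exact equations; it assumes a common risk-sensitive RL formulation in which a value trace Q is updated by the TD error, a risk trace h tracks the squared TD error, and actions are chosen on the utility Q − α·√h. All function and parameter names are illustrative.

```python
import random

def run_risk_sensitive_bandit(alpha, n_trials=5000, lr=0.1, seed=0):
    """Two-armed bandit with a 'sure' arm and a 'risky' arm of equal mean payoff.

    Q tracks expected reward (its update, delta, plays the DA-like role);
    h tracks expected squared TD error (risk); alpha (the 5HT-like parameter)
    weights risk in the action utility. Illustrative sketch only.
    """
    rng = random.Random(seed)
    arms = {
        "sure":  lambda: 1.0,                                  # always pays 1
        "risky": lambda: 2.0 if rng.random() < 0.5 else 0.0,   # mean 1, high variance
    }
    Q = {a: 0.0 for a in arms}        # value estimate per arm
    h = {a: 0.0 for a in arms}        # risk estimate (mean squared TD error)
    choices = {a: 0 for a in arms}
    for _ in range(n_trials):
        # risk-adjusted utility: value minus alpha-weighted risk
        util = {a: Q[a] - alpha * h[a] ** 0.5 for a in arms}
        # epsilon-greedy choice on utility
        a = max(util, key=util.get) if rng.random() > 0.1 else rng.choice(list(arms))
        r = arms[a]()
        delta = r - Q[a]              # reward prediction error (DA-like signal)
        Q[a] += lr * delta
        xi = delta ** 2 - h[a]        # risk prediction error
        h[a] += lr * xi
        choices[a] += 1
    return choices

# Higher alpha means stronger risk aversion, so the risky arm is chosen less often.
risk_averse = run_risk_sensitive_bandit(alpha=1.0)
risk_neutral = run_risk_sensitive_bandit(alpha=0.0)
```

Varying `alpha` while holding the environment fixed mirrors the first paradigm in the abstract: the same reward statistics yield different risk preferences depending on the 5HT-like parameter.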


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1079/3997037/1f37a92def17/fncom-08-00047-g0001.jpg
