

Mice exhibit stochastic and efficient action switching during probabilistic decision making.

Affiliations

Department of Neurobiology, Harvard Medical School, Boston, MA 02115.

HHMI, Harvard Medical School, Boston, MA 02115.

Publication information

Proc Natl Acad Sci U S A. 2022 Apr 12;119(15):e2113961119. doi: 10.1073/pnas.2113961119. Epub 2022 Apr 6.

Abstract

In probabilistic and nonstationary environments, individuals must use internal and external cues to flexibly make decisions that lead to desirable outcomes. To gain insight into the process by which animals choose between actions, we trained mice in a task with time-varying reward probabilities. In our implementation of such a two-armed bandit task, thirsty mice use information about recent action and action–outcome histories to choose between two ports that deliver water probabilistically. Here we comprehensively modeled choice behavior in this task, including the trial-to-trial changes in port selection, i.e., action switching behavior. We find that mouse behavior is, at times, deterministic and, at others, apparently stochastic. The behavior deviates from that of a theoretically optimal agent performing Bayesian inference in a hidden Markov model (HMM). We formulate a set of models based on logistic regression, reinforcement learning, and sticky Bayesian inference that we demonstrate are mathematically equivalent and that accurately describe mouse behavior. The switching behavior of mice in the task is captured in each model by a stochastic action policy, a history-dependent representation of action value, and a tendency to repeat actions despite incoming evidence. The models parsimoniously capture behavior across different environmental conditions by varying the stickiness parameter, and like the mice, they achieve nearly maximal reward rates. These results indicate that mouse behavior reaches near-maximal performance with reduced action switching and can be described by a set of equivalent models with a small number of relatively fixed parameters.
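The three ingredients named in the abstract (a stochastic action policy, a history-dependent representation of action value, and a tendency to repeat actions) can be illustrated with a small simulation. The Python sketch below is not the authors' model or code. The function name `simulate_sticky_agent`, the update rule, and every parameter value (`p_high`, `p_block_switch`, `decay`, `evidence_weight`, `stickiness`) are assumptions chosen only to show how such an agent might behave in a two-port task whose reward probabilities occasionally swap.

```python
import math
import random


def simulate_sticky_agent(n_trials=2000, p_high=0.8, p_block_switch=0.02,
                          decay=0.8, evidence_weight=2.0, stickiness=1.0,
                          seed=0):
    """Simulate a two-port bandit and a sticky, stochastic agent.

    All parameter values are illustrative assumptions, not fitted values.
    Returns (reward_rate, switch_rate).
    """
    rng = random.Random(seed)
    good_port = rng.choice([0, 1])  # hidden state: port with the higher reward probability
    evidence = 0.0                  # signed evidence: > 0 favors port 1, < 0 favors port 0
    prev_choice = None
    n_rewards = 0
    n_switches = 0

    for _ in range(n_trials):
        # Stickiness: a fixed bonus toward repeating the previous action.
        if prev_choice is None:
            repeat_bias = 0.0
        else:
            repeat_bias = stickiness if prev_choice == 1 else -stickiness

        # Stochastic policy: logistic function of the log-odds of choosing port 1.
        log_odds = evidence_weight * evidence + repeat_bias
        p_port1 = 1.0 / (1.0 + math.exp(-log_odds))
        choice = 1 if rng.random() < p_port1 else 0

        # Probabilistic water delivery at the chosen port.
        p_reward = p_high if choice == good_port else 1.0 - p_high
        reward = 1 if rng.random() < p_reward else 0

        # History-dependent value: exponentially decaying sum of rewarded choices.
        signed_choice = 1 if choice == 1 else -1
        evidence = decay * evidence + signed_choice * reward

        n_rewards += reward
        if prev_choice is not None and choice != prev_choice:
            n_switches += 1
        prev_choice = choice

        # Nonstationarity: the ports occasionally swap reward probabilities.
        if rng.random() < p_block_switch:
            good_port = 1 - good_port

    return n_rewards / n_trials, n_switches / (n_trials - 1)


if __name__ == "__main__":
    reward_rate, switch_rate = simulate_sticky_agent()
    print(f"reward rate: {reward_rate:.3f}, switch rate: {switch_rate:.3f}")
```

Rerunning the sketch with different `stickiness` values while holding the other parameters fixed gives a rough feel for how a single parameter can trade off action switching against reward rate. The numbers it produces are properties of this toy agent only and should not be read as results from the paper.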


