Sorbonne Université, CNRS, Institut des Systèmes Intelligents et de Robotique (ISIR), 75005, Paris, France.
Sorbonne Université, INSERM, CNRS, Neuroscience Paris Seine - Institut de Biologie Paris Seine (NPS - IBPS), 75005, Paris, France.
Commun Biol. 2020 Jan 21;3(1):34. doi: 10.1038/s42003-020-0759-x.
Can decisions be made solely by chance? Can variability be intrinsic to the decision-maker, or is it inherited from environmental conditions? To investigate these questions, we designed a deterministic setting in which mice are rewarded for non-repetitive choice sequences, and modeled the experiment using reinforcement learning. We found that mice progressively increased their choice variability. Although an optimal strategy based on sequence learning was theoretically possible and would have been more rewarding, the animals used pseudo-random selection, which ensured a high success rate. This was not the case when animals were exposed to uniform probabilistic reward delivery. We also show that mice were blind to changes in the temporal structure of reward delivery once they had learned to choose at random. Overall, our results demonstrate that a decision-making process can self-generate variability and randomness, even when the rules governing reward delivery are neither stochastic nor volatile.
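The idea that a deterministic rule can favor random-looking behavior can be illustrated with a toy simulation. The sketch below is not the task or the model used in the study: the reward rule (a choice is rewarded only if the latest `window` choices form a pattern absent from the recent history), the parameters `n_options`, `window`, and `memory`, and both policies are hypothetical and chosen only for illustration.

```python
import random


def run_session(policy, n_trials=2000, n_options=3, window=3, memory=9, seed=0):
    """Return the fraction of rewarded trials under a deterministic novelty rule.

    Hypothetical rule (an assumption, not the published protocol): a trial is
    rewarded when the last `window` choices form a pattern that does not occur
    among the `memory` most recent earlier patterns.
    """
    rng = random.Random(seed)
    history = []
    rewards = 0
    for _ in range(n_trials):
        choice = policy(history, rng, n_options)
        history.append(choice)
        recent = tuple(history[-window:])
        # Choices that precede the current one, limited to a short memory span.
        past = history[-(memory + window):-1]
        seen = {tuple(past[i:i + window]) for i in range(len(past) - window + 1)}
        if len(recent) == window and recent not in seen:
            rewards += 1
    return rewards / n_trials


def random_policy(history, rng, n_options):
    """Pseudo-random selection: pick any option with equal probability."""
    return rng.randrange(n_options)


def cyclic_policy(history, rng, n_options):
    """Repetitive selection: visit the options in a fixed order."""
    return len(history) % n_options


random_rate = run_session(random_policy)
cyclic_rate = run_session(cyclic_policy)
print(f"pseudo-random policy: {random_rate:.2f}")
print(f"cyclic policy:        {cyclic_rate:.2f}")
```

Under these assumptions, the reward rule itself contains no randomness, yet the pseudo-random policy is rewarded far more often than the repetitive one, because the cyclic policy keeps reproducing patterns it has just emitted.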