A Reinforcement Learning Approach to Understanding Procrastination: Does Inaccurate Value Approximation Cause Irrational Postponing of a Task?

Author information

Feng Zheyu, Nagase Asako Mitsuto, Morita Kenji

Affiliations

Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo, Japan.

Division of Neurology, Department of Brain and Neurosciences, Faculty of Medicine, Tottori University, Yonago, Japan.

Publication information

Front Neurosci. 2021 Sep 16;15:660595. doi: 10.3389/fnins.2021.660595. eCollection 2021.

Abstract

Procrastination is the voluntary but irrational postponing of a task despite being aware that the delay can lead to worse consequences. It has been extensively studied in the psychological field, from contributing factors to theoretical models. From the perspective of value-based decision making and reinforcement learning (RL), procrastination has been suggested to be caused by non-optimal choice resulting from cognitive limitations. Exactly what sort of cognitive limitations are involved, however, remains elusive. In the current study, we examined whether a particular type of cognitive limitation, namely, inaccurate valuation resulting from inadequate state representation, would cause procrastination. Recent work has suggested that humans may adopt a particular type of state representation called the successor representation (SR) and that humans can learn to represent states by relatively low-dimensional features. Combining these suggestions, we assumed a dimension-reduced version of SR. We modeled a series of behaviors of a "student" doing assignments during the school term, when putting off the assignments (i.e., procrastination) is not allowed, and during the vacation, when the "student" can freely choose whether to procrastinate. We assumed that the "student" had acquired a rigid reduced SR of each state, corresponding to each step in completing an assignment, under the policy without procrastination. Through temporal-difference (TD) learning, the "student" learned the approximated value of each state, computed as a linear function of the features of the states in the rigid reduced SR. During the vacation, the "student" decided at each time-step whether to procrastinate based on these approximated values. Simulation results showed that the reduced SR-based RL model generated procrastination behavior, which worsened across episodes. According to the values approximated by the "student," procrastinating was the better choice, whereas not procrastinating was mostly better according to the true values. Thus, the current model generated procrastination behavior caused by inaccurate value approximation, which resulted from the adoption of the reduced SR as state representation. These findings indicate that the reduced SR, or more generally, dimension reduction in state representation, can be a potential form of cognitive limitation that leads to procrastination.
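
The pipeline described in the abstract (an SR acquired under the no-procrastination policy, reduced in dimension, then used as features for linear TD value learning) can be illustrated with a minimal sketch. This is not the authors' code. The chain length, the reward values, the learning parameters, and in particular the choice to reduce the SR to a single feature (discounted future occupancy of the final step) are all assumptions made for illustration, and the vacation-period choice between procrastinating and not procrastinating is not modeled.

```python
import numpy as np

def successor_representation(n_states, gamma):
    """SR under a policy that always advances one step (no procrastination)."""
    P = np.zeros((n_states, n_states))
    for s in range(n_states - 1):
        P[s, s + 1] = 1.0  # deterministic forward move; the last state is terminal
    # M = sum_k gamma^k P^k, i.e. discounted future occupancy of every state
    return np.linalg.inv(np.eye(n_states) - gamma * P)

def td_linear(features, rewards, gamma, alpha, n_episodes):
    """TD(0) with the linear approximation V(s) = w . x(s) on the forward chain."""
    n_states = features.shape[0]
    w = np.zeros(features.shape[1])
    for _ in range(n_episodes):
        for s in range(n_states):
            # value of the next state, or 0 after the terminal state
            v_next = features[s + 1] @ w if s + 1 < n_states else 0.0
            delta = rewards[s] + gamma * v_next - features[s] @ w  # TD error
            w += alpha * delta * features[s]
    return w

n_states, gamma = 5, 0.9  # hypothetical: 5 steps per assignment
M = successor_representation(n_states, gamma)

# Assumed reduction: keep only the SR column for the final state,
# so each state is described by a single feature.
x = M[:, [-1]]

# Hypothetical rewards obtained when leaving each state:
# a small effort cost per step and a reward on completion.
rewards = np.array([-0.1, -0.1, -0.1, -0.1, 1.0])

w = td_linear(x, rewards, gamma, alpha=0.1, n_episodes=2000)
v_approx = x @ w      # values as seen through the reduced SR
v_true = M @ rewards  # exact values under the no-procrastination policy

print("approximated:", np.round(v_approx, 3))
print("true:        ", np.round(v_true, 3))
```

With a single feature, the approximated values are forced to be proportional to that feature, so they cannot match the true values at every state under these assumed rewards. Comparing `v_approx` with `v_true` shows where the reduced representation introduces valuation errors of the kind the abstract refers to.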

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f047/8481628/017381647552/fnins-15-660595-g001.jpg
