Gershman Samuel J, Lai Lucy
Department of Psychology and Center for Brain Science, Harvard University, US.
Center for Brains, Minds and Machines, MIT, US.
Comput Psychiatr. 2021 May 25;5(1):38-53. doi: 10.5334/cpsy.71. eCollection 2021.
Action selection requires a policy that maps states of the world to a distribution over actions. The amount of memory needed to specify the policy (the policy complexity) increases with the state-dependence of the policy. If there is a capacity limit for policy complexity, then there will also be a trade-off between reward and complexity, since some reward will need to be sacrificed in order to satisfy the capacity constraint. This paper empirically characterizes the trade-off between reward and complexity for both schizophrenia patients and healthy controls. Schizophrenia patients adopt lower complexity policies on average, and these policies are more strongly biased away from the optimal reward-complexity trade-off curve compared to healthy controls. However, healthy controls are also biased away from the optimal trade-off curve, and both groups appear to lie on the same empirical trade-off curve. We explain these findings using a cost-sensitive actor-critic model. Our empirical and theoretical results shed new light on cognitive effort abnormalities in schizophrenia.
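The reward-complexity trade-off described above can be illustrated with a small numerical sketch. It assumes that policy complexity is quantified as the mutual information between states and actions, and that the optimal trade-off curve is traced by a Blahut-Arimoto-style iteration with a trade-off parameter `beta`. The abstract does not state either formula, so both are assumptions here. The function names, the toy reward matrix `q`, and the uniform state distribution are hypothetical and chosen only for illustration.

```python
import numpy as np

def policy_complexity(policy, p_state):
    """Mutual information I(S;A) in bits (assumed measure of policy complexity)."""
    p_action = p_state @ policy                   # marginal action distribution
    joint = p_state[:, None] * policy             # joint distribution P(s, a)
    indep = p_state[:, None] * p_action[None, :]  # product of the marginals
    mask = joint > 0                              # skip zero-probability terms
    return float(np.sum(joint[mask] * np.log2(joint[mask] / indep[mask])))

def expected_reward(policy, p_state, q):
    """Average reward under the policy, with q[s, a] the reward of action a in state s."""
    return float(np.sum(p_state[:, None] * policy * q))

def optimal_policy(q, p_state, beta, n_iter=200):
    """Blahut-Arimoto-style iteration: policy(a|s) proportional to P(a) * exp(beta * q[s, a])."""
    n_actions = q.shape[1]
    p_action = np.full(n_actions, 1.0 / n_actions)
    for _ in range(n_iter):
        logits = beta * q + np.log(np.clip(p_action, 1e-12, None))
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        policy = np.exp(logits)
        policy /= policy.sum(axis=1, keepdims=True)
        p_action = p_state @ policy                   # update the action marginal
    return policy

# Hypothetical toy task: three states, each with its own rewarded action.
q = np.eye(3)
p_state = np.full(3, 1.0 / 3.0)

for beta in (0.0, 1.0, 2.0, 5.0):
    pi = optimal_policy(q, p_state, beta)
    print(beta, policy_complexity(pi, p_state), expected_reward(pi, p_state, q))
```

In this toy task, a larger `beta` yields a more state-dependent policy, which earns more reward at the price of higher complexity. A capacity limit would correspond to capping the complexity value, which forces the agent onto a lower-reward point of the curve.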