The Computational Development of Reinforcement Learning during Adolescence

Authors

Palminteri Stefano, Kilford Emma J, Coricelli Giorgio, Blakemore Sarah-Jayne

Affiliations

Institute of Cognitive Neuroscience, University College London, London, United Kingdom.

Laboratoire de Neurosciences Cognitive, École Normale Supérieure, Paris, France.

Publication information

PLoS Comput Biol. 2016 Jun 20;12(6):e1004953. doi: 10.1371/journal.pcbi.1004953. eCollection 2016 Jun.

DOI:10.1371/journal.pcbi.1004953

PMID:27322574

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4920542/

Abstract

Adolescence is a period of life characterised by changes in learning and decision-making. Learning and decision-making do not rely on a unitary system, but instead require the coordination of different cognitive processes that can be mathematically formalised as dissociable computational modules. Here, we aimed to trace the developmental time-course of the computational modules responsible for learning from reward or punishment, and learning from counterfactual feedback. Adolescents and adults carried out a novel reinforcement learning paradigm in which participants learned the association between cues and probabilistic outcomes, where the outcomes differed in valence (reward versus punishment) and feedback was either partial or complete (either the outcome of the chosen option only, or the outcomes of both the chosen and unchosen option, were displayed). Computational strategies changed during development: whereas adolescents' behaviour was better explained by a basic reinforcement learning algorithm, adults' behaviour integrated increasingly complex computational features, namely a counterfactual learning module (enabling enhanced performance in the presence of complete feedback) and a value contextualisation module (enabling symmetrical reward and punishment learning). Unlike adults, adolescent performance did not benefit from counterfactual (complete) feedback. In addition, while adults learned symmetrically from both reward and punishment, adolescents learned from reward but were less likely to learn from punishment. This tendency to rely on rewards and not to consider alternative consequences of actions might contribute to our understanding of decision-making in adolescence.
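The modular structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' fitted model: the function name `update`, the parameter names `alpha`, `alpha_cf` and `alpha_ctx`, the outcome coding (+1/0 for reward cues, 0/-1 for punishment cues) and the rule that the context value tracks the mean of the observed outcomes are all assumptions made for illustration. Setting `alpha_cf` and `alpha_ctx` to zero leaves a basic reinforcement learning update; non-zero values switch on a counterfactual module and a value contextualisation module respectively.

```python
def update(q, v, chosen, r_chosen, r_unchosen=None,
           alpha=0.3, alpha_cf=0.0, alpha_ctx=0.0):
    """One learning step for a single pair of cues (one context).

    q          : list of two option values
    v          : context value (stays 0 when alpha_ctx == 0)
    chosen     : index (0 or 1) of the chosen option
    r_chosen   : outcome of the chosen option
    r_unchosen : outcome of the unchosen option, or None under partial feedback
    All learning-rate names are hypothetical, not taken from the paper.
    """
    q = list(q)
    unchosen = 1 - chosen

    # Value contextualisation (assumed form): the context value tracks the
    # mean of the outcomes that were actually displayed on this trial.
    if alpha_ctx > 0.0:
        observed = [r_chosen] if r_unchosen is None else [r_chosen, r_unchosen]
        v += alpha_ctx * (sum(observed) / len(observed) - v)

    # Factual update: prediction error on the chosen option, computed
    # relative to the context value.
    q[chosen] += alpha * (r_chosen - v - q[chosen])

    # Counterfactual update: only possible when complete feedback is shown.
    if r_unchosen is not None and alpha_cf > 0.0:
        q[unchosen] += alpha_cf * (r_unchosen - v - q[unchosen])

    return q, v


# Punishment context, complete feedback: the chosen cue avoided a loss (0),
# the unchosen cue would have lost (-1).
basic = update([0.0, 0.0], 0.0, 0, 0.0, -1.0, alpha=0.5)
full = update([0.0, 0.0], 0.0, 0, 0.0, -1.0,
              alpha=0.5, alpha_cf=0.5, alpha_ctx=0.5)
print("basic RL:", basic)
print("with counterfactual + context modules:", full)
```

In this sketch the basic learner receives no learning signal from a successfully avoided punishment, because an outcome of 0 matches its initial expectation. Once the context value becomes negative, the same outcome produces a positive prediction error, which is one way to picture how contextualisation could make punishment learning resemble reward learning.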

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0b5/4920542/99344aa17a8f/pcbi.1004953.g001.jpg

Similar articles

Distinct linear and non-linear trajectories of reward and punishment reversal learning during development: relevance for dopamine's role in adolescent decision making.

Dev Cogn Neurosci. 2011 Oct;1(4):578-90. doi: 10.1016/j.dcn.2011.06.007. Epub 2011 Jun 25.

How we learn to make decisions: rapid propagation of reinforcement learning prediction errors in humans.

J Cogn Neurosci. 2014 Mar;26(3):635-44. doi: 10.1162/jocn_a_00509. Epub 2013 Oct 29.

Adolescents adapt more slowly than adults to varying reward contingencies.

J Cogn Neurosci. 2014 Dec;26(12):2670-2681. doi: 10.1162/jocn_a_00677. Epub 2014 Jun 24.

Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing.

PLoS Comput Biol. 2017 Aug 11;13(8):e1005684. doi: 10.1371/journal.pcbi.1005684. eCollection 2017 Aug.

Reward and avoidance learning in the context of aversive environments and possible implications for depressive symptoms.

Psychopharmacology (Berl). 2019 Aug;236(8):2437-2449. doi: 10.1007/s00213-019-05299-9. Epub 2019 Jun 28.

Cognitive flexibility in adolescence: neural and behavioral mechanisms of reward prediction error processing in adaptive decision making during development.

Neuroimage. 2015 Jan 1;104:347-54. doi: 10.1016/j.neuroimage.2014.09.018. Epub 2014 Sep 16.

Modulation of value-based decision making behavior by subregions of the rat prefrontal cortex.

Psychopharmacology (Berl). 2020 May;237(5):1267-1280. doi: 10.1007/s00213-020-05454-7. Epub 2020 Feb 6.

An Upside to Reward Sensitivity: The Hippocampus Supports Enhanced Reinforcement Learning in Adolescence.

Neuron. 2016 Oct 5;92(1):93-99. doi: 10.1016/j.neuron.2016.08.031.

How pupil responses track value-based decision-making during and after reinforcement learning.

PLoS Comput Biol. 2018 Nov 30;14(11):e1006632. doi: 10.1371/journal.pcbi.1006632. eCollection 2018 Nov.

Cited by

Nucleus accumbens dopamine release reflects Bayesian inference during instrumental learning.

PLoS Comput Biol. 2025 Jul 2;21(7):e1013226. doi: 10.1371/journal.pcbi.1013226. eCollection 2025 Jul.

Navigating a varying reward environment in childhood and adolescence.

Sci Rep. 2025 Jul 2;15(1):22715. doi: 10.1038/s41598-025-05725-3.

Reversal learning is influenced by cognitive flexibility and develops throughout early adolescence.

NPJ Sci Learn. 2025 May 12;10(1):27. doi: 10.1038/s41539-025-00308-3.

Electrical brain activations in preadolescents during a probabilistic reward-learning task reflect cognitive processes and behavior strategies.

Front Hum Neurosci. 2025 Jan 30;19:1460584. doi: 10.3389/fnhum.2025.1460584. eCollection 2025.

Interpretation of individual differences in computational neuroscience using a latent input approach.

Dev Cogn Neurosci. 2025 Apr;72:101512. doi: 10.1016/j.dcn.2025.101512. Epub 2025 Jan 16.

The preference for surprise in reinforcement learning underlies the differences in developmental changes in risk preference between autistic and neurotypical youth.

Mol Autism. 2025 Jan 16;16(1):3. doi: 10.1186/s13229-025-00637-5.

The connecting brain in context: How adolescent plasticity supports learning and development.

Dev Cogn Neurosci. 2025 Jan;71:101486. doi: 10.1016/j.dcn.2024.101486. Epub 2024 Nov 28.

Decrease in decision noise from adolescence into adulthood mediates an increase in more sophisticated choice behaviors and performance gain.

PLoS Biol. 2024 Nov 14;22(11):e3002877. doi: 10.1371/journal.pbio.3002877. eCollection 2024 Nov.

Functional connectivity between the nucleus accumbens and amygdala underlies avoidance learning during adolescence: Implications for developmental psychopathology.

Dev Psychopathol. 2024 Sep 26:1-13. doi: 10.1017/S095457942400141X.

Compulsive avoidance in youths and adults with OCD: an aversive pavlovian-to-instrumental transfer study.

Transl Psychiatry. 2024 Jul 26;14(1):308. doi: 10.1038/s41398-024-03028-1.

References

Developing developmental cognitive neuroscience: From agenda setting to hypothesis testing.

Dev Cogn Neurosci. 2016 Feb;17:138-44. doi: 10.1016/j.dcn.2015.12.011. Epub 2015 Dec 23.

The dual systems model: Review, reappraisal, and reaffirmation.

Dev Cogn Neurosci. 2016 Feb;17:103-17. doi: 10.1016/j.dcn.2015.12.010. Epub 2015 Dec 29.

Beyond simple models of adolescence to an integrated circuit-based account: A commentary.

Dev Cogn Neurosci. 2016 Feb;17:128-30. doi: 10.1016/j.dcn.2015.12.006. Epub 2015 Dec 17.

Contextual modulation of value signals in reward and punishment learning.

Nat Commun. 2015 Aug 25;6:8096. doi: 10.1038/ncomms9096.

Computational psychiatry.

Neuron. 2014 Nov 5;84(3):638-54. doi: 10.1016/j.neuron.2014.10.018.

An evolutionary computational theory of prefrontal executive function in decision-making.

Philos Trans R Soc Lond B Biol Sci. 2014 Nov 5;369(1655). doi: 10.1098/rstb.2013.0474.

Beyond simple models of self-control to circuit-based accounts of adolescent behavior.

Annu Rev Psychol. 2015 Jan 3;66:295-319. doi: 10.1146/annurev-psych-010814-015156. Epub 2014 Aug 4.

Anterior cingulate engagement in a foraging context reflects choice difficulty, not foraging value.

Nat Neurosci. 2014 Sep;17(9):1249-54. doi: 10.1038/nn.3771. Epub 2014 Jul 27.

The developmental mismatch in structural brain maturation during adolescence.

Dev Neurosci. 2014;36(3-4):147-60. doi: 10.1159/000362328. Epub 2014 Jun 27.

Adolescents adapt more slowly than adults to varying reward contingencies.

J Cogn Neurosci. 2014 Dec;26(12):2670-2681. doi: 10.1162/jocn_a_00677. Epub 2014 Jun 24.
