Russin Jacob, Pavlick Ellie, Frank Michael J
Department of Computer Science, Department of Cognitive and Psychological Sciences, Brown University.
Department of Computer Science, Brown University.
ArXiv. 2025 Apr 25:arXiv:2402.08674v4.
Human learning embodies a striking duality: sometimes, we appear capable of following logical, compositional rules and benefit from structured curricula (e.g., in formal education), while other times, we rely on an incremental approach or trial-and-error, learning better from curricula that are randomly interleaved. Influential psychological theories explain this seemingly disparate behavioral evidence by positing two qualitatively different learning systems: one for rapid, rule-based inferences and another for slow, incremental adaptation. It remains unclear how to reconcile such theories with neural networks, which learn via incremental weight updates and are thus a natural model for the latter type of learning, but are not obviously compatible with the former. However, recent evidence suggests that metalearning neural networks and large language models are capable of "in-context learning" (ICL), the ability to flexibly grasp the structure of a new task from a few examples. Here, we show that the dynamic interplay between ICL and default in-weight learning (IWL) naturally captures a broad range of learning phenomena observed in humans, reproducing curriculum effects on category-learning and compositional tasks, and recapitulating a tradeoff between flexibility and retention. Our work shows how emergent ICL can equip neural networks with fundamentally different learning properties that can coexist with their native IWL, thus offering a novel perspective on dual-process theories and human cognitive flexibility.