R. Thomas McCoy, Thomas L. Griffiths
Department of Linguistics, Yale University, 370 Temple St, New Haven, CT 06511, USA.
Wu Tsai Institute, Yale University, 100 College St, New Haven, CT 06510, USA.
Nat Commun. 2025 May 20;16(1):4676. doi: 10.1038/s41467-025-59957-y.
Humans can learn languages from remarkably little experience. Developing computational models that explain this ability has been a major challenge in cognitive science. Existing approaches have been successful at explaining how humans generalize rapidly in controlled settings but are usually too restrictive to tractably handle naturalistic data. We show that learning from limited naturalistic data is possible with an approach that bridges the divide between two popular modeling traditions: Bayesian models and neural networks. This approach distills a Bayesian model's inductive biases (the factors that guide generalization) into a neural network that has flexible representations. Like a Bayesian model, the resulting system can learn formal linguistic patterns from limited data. Like a neural network, it can also learn aspects of English syntax from naturally occurring sentences. The model thus provides a single system that can both learn rapidly and handle naturalistic data.
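The abstract's central idea, distilling a Bayesian model's inductive biases into a neural network, is commonly realized by meta-training the network on many small datasets sampled from the Bayesian prior, so that the network's learning comes to reflect that prior. The sketch below is an assumption-laden illustration of this general recipe, not the paper's implementation: it assumes a toy prior over two formal languages, a tiny character-level LSTM, and a Reptile-style meta-update, and all names (sample_language, CharLM, episode_loss) and hyperparameters are hypothetical.

```python
"""Illustrative sketch of distilling a prior into a neural network via
meta-learning. Assumptions (not from the paper): a toy prior over two
formal languages, a small character-level LSTM, a Reptile-style update."""

import copy
import random

import torch
import torch.nn as nn

random.seed(0)
torch.manual_seed(0)

# Character inventory: 'a', 'b', plus an end-of-string marker '$'.
CHARS = ["a", "b", "$"]
CHAR_TO_ID = {c: i for i, c in enumerate(CHARS)}

def sample_language(max_n=4):
    """Toy 'prior' over formal languages: with probability 0.5 the language
    is {a^n b^n}, otherwise it is {(ab)^n}. A stand-in for a real Bayesian
    prior over grammars."""
    if random.random() < 0.5:
        return ["a" * n + "b" * n for n in range(1, max_n + 1)]
    return ["ab" * n for n in range(1, max_n + 1)]

def encode(s):
    """Map a string plus its end marker to a tensor of character ids."""
    return torch.tensor([CHAR_TO_ID[c] for c in s + "$"])

class CharLM(nn.Module):
    """Tiny character-level language model: embedding -> LSTM -> logits."""
    def __init__(self, hidden=32):
        super().__init__()
        self.emb = nn.Embedding(len(CHARS), hidden)
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, len(CHARS))

    def forward(self, ids):
        h, _ = self.rnn(self.emb(ids.unsqueeze(0)))
        return self.out(h.squeeze(0))

def episode_loss(model, strings):
    """Next-character cross-entropy averaged over an episode's strings."""
    loss = torch.tensor(0.0)
    for s in strings:
        ids = encode(s)
        logits = model(ids[:-1])
        loss = loss + nn.functional.cross_entropy(logits, ids[1:])
    return loss / len(strings)

meta_model = CharLM()
META_LR, INNER_LR, INNER_STEPS = 0.1, 0.05, 5

# Reptile-style meta-training: sample a language from the prior, adapt a
# copy of the network to it, then nudge the meta-parameters toward the
# adapted parameters. Over many episodes, the prior is distilled into the
# network's initial weights.
for step in range(200):
    strings = sample_language()
    adapted = copy.deepcopy(meta_model)
    opt = torch.optim.SGD(adapted.parameters(), lr=INNER_LR)
    for _ in range(INNER_STEPS):
        opt.zero_grad()
        episode_loss(adapted, strings).backward()
        opt.step()
    with torch.no_grad():
        for p_meta, p_task in zip(meta_model.parameters(),
                                  adapted.parameters()):
            p_meta += META_LR * (p_task - p_meta)

# A new language sampled from the same prior should now require only a few
# gradient steps to learn, mirroring the rapid learning described above.
test_strings = sample_language()
print("loss before adaptation:", episode_loss(meta_model, test_strings).item())
```

Under these assumptions, adapting the meta-trained network to a handful of strings from a newly sampled language takes only a few gradient steps, which is the sense in which the network behaves like the Bayesian model whose prior it was trained on.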