Institute of Cognitive Science, University of Osnabrück.
Department of Linguistics, University of Tübingen.
Cogn Sci. 2022 Jan;46(1):e13069. doi: 10.1111/cogs.13069.
One of the great challenges in word learning is that words are typically uttered in a context with many potential referents. Children's tendency to associate novel words with novel referents, which is taken to reflect a mutual exclusivity (ME) bias, forms a useful disambiguation mechanism. We study semantic learning in pragmatic agents, combining the Rational Speech Act model with gradient-based learning, and explore the conditions under which such agents show an ME bias. This approach provides a framework not only for investigating a pragmatic account of the ME bias in humans but also for building artificial agents that display an ME bias. A series of analyses demonstrates striking parallels between our model and human word learning regarding several aspects relevant to the ME bias phenomenon: online inference, long-term learning, and developmental effects. By testing different implementations, we find that two components, pragmatic online inference and incremental collection of evidence for one-to-one correspondences between words and referents, play an important role in modeling the developmental trajectory of the ME bias. Finally, we outline an extension of our model to a deep neural network architecture that can process more naturalistic visual and linguistic input. Until now, in contrast to children, deep neural networks have needed indirect access to (supposedly novel) test inputs during training to display an ME bias. Our model is the first to display an ME bias without this manipulation.
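To make the role of pragmatic online inference concrete, the following is a minimal sketch of standard Rational Speech Act reasoning over a fixed toy lexicon. It is not the authors' implementation and omits the gradient-based learning component entirely. The words, referents, lexicon scores, and function names are all hypothetical, chosen only to illustrate how a pragmatic listener can shift a novel word toward a novel referent even when the lexicon itself is uninformative about that word.

```python
# Toy Rational Speech Act (RSA) sketch. All names and numbers are
# hypothetical illustrations, not values from the paper.
WORDS = ["ball", "dax"]                      # "dax" plays the novel word
REFERENTS = ["ball_object", "novel_object"]
LEXICON = [
    [0.9, 0.1],  # "ball": strongly associated with the familiar object
    [0.5, 0.5],  # "dax": no evidence yet, so scores are uninformative
]


def normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def literal_listener(lexicon):
    """L0(referent | word): each lexicon row normalized over referents."""
    return [normalize(row) for row in lexicon]


def pragmatic_speaker(lexicon, alpha=1.0):
    """S1(word | referent), proportional to L0(referent | word) ** alpha.

    Returns a list indexed as s1[referent][word].
    """
    l0 = literal_listener(lexicon)
    n_words, n_refs = len(l0), len(l0[0])
    return [
        normalize([l0[w][r] ** alpha for w in range(n_words)])
        for r in range(n_refs)
    ]


def pragmatic_listener(lexicon, alpha=1.0):
    """L1(referent | word), proportional to S1(word | referent).

    Assumes a uniform prior over referents.
    Returns a list indexed as l1[word][referent].
    """
    s1 = pragmatic_speaker(lexicon, alpha)
    n_refs, n_words = len(s1), len(s1[0])
    return [
        normalize([s1[r][w] for r in range(n_refs)])
        for w in range(n_words)
    ]


l0 = literal_listener(LEXICON)
l1 = pragmatic_listener(LEXICON)
dax = WORDS.index("dax")
novel = REFERENTS.index("novel_object")
print("L0(novel_object | dax) =", round(l0[dax][novel], 3))
print("L1(novel_object | dax) =", round(l1[dax][novel], 3))
```

In this toy setting the literal listener is indifferent between the two referents for "dax", whereas the pragmatic listener favors the novel object, because a speaker who meant the familiar object would more likely have said "ball". How such a preference emerges and changes over learning is what the paper's analyses address; this sketch covers only the single inference step.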