Department of Psychology, Stanford University, United States.
Cognition. 2020 Jun;199:104092. doi: 10.1016/j.cognition.2019.104092. Epub 2020 Mar 2.
Identifying a spoken word in a referential context requires both the ability to integrate multimodal input and the ability to reason under uncertainty. How do these tasks interact with one another? We study how adults identify novel words under joint uncertainty in the auditory and visual modalities, and we propose an ideal observer model of how cues in these modalities are combined optimally. Model predictions are tested in four experiments in which recognition occurs under various sources of uncertainty. We found that participants use both auditory and visual cues to recognize novel words. When the signal is not distorted with environmental noise, participants weight the auditory and visual cues optimally, that is, according to the relative reliability of each modality. In contrast, when noise is added to one modality, human perceivers systematically prefer the unperturbed modality to a greater extent than the optimal model does. This work extends the literature on perceptual cue combination to the case of word recognition in a referential context. In addition, this context offers a link to the study of multimodal information in word meaning learning.
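The optimal weighting described in the abstract is typically the standard ideal-observer rule for fusing independent Gaussian cues, in which each cue is weighted by its reliability (inverse variance). The sketch below illustrates that rule in general form; the function name and the Gaussian-cue assumption are illustrative and not taken from the paper itself.

```python
import math

def combine_cues(x_audio, sigma_audio, x_visual, sigma_visual):
    """Reliability-weighted (inverse-variance) cue combination.

    Illustrative sketch: assumes two independent Gaussian cues, each
    summarized by an estimate x and a noise standard deviation sigma.
    Each cue's weight is proportional to its reliability 1 / sigma^2.
    """
    r_a = 1.0 / sigma_audio ** 2   # reliability of the auditory cue
    r_v = 1.0 / sigma_visual ** 2  # reliability of the visual cue
    w_a = r_a / (r_a + r_v)        # normalized weight on audition
    estimate = w_a * x_audio + (1.0 - w_a) * x_visual
    # The fused estimate is at least as reliable as either cue alone.
    sigma_combined = math.sqrt(1.0 / (r_a + r_v))
    return estimate, sigma_combined
```

Under this rule, adding noise to one modality lowers that modality's weight gracefully; the paper's finding is that human perceivers down-weight the noisy modality even more strongly than this optimal scheme predicts.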