
Scan patterns predict sentence production in the cross-modal processing of visual scenes.

Author Information

Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh.

Publication Information

Cogn Sci. 2012 Sep-Oct;36(7):1204-23. doi: 10.1111/j.1551-6709.2012.01246.x. Epub 2012 Apr 9.

Abstract

Most everyday tasks involve multiple modalities, which raises the question of how the processing of these modalities is coordinated by the cognitive system. In this paper, we focus on the coordination of visual attention and linguistic processing during speaking. Previous research has shown that objects in a visual scene are fixated before they are mentioned, leading us to hypothesize that the scan pattern of a participant can be used to predict what he or she will say. We test this hypothesis using a data set of cued scene descriptions of photo-realistic scenes. We demonstrate that similar scan patterns are correlated with similar sentences, within and between visual scenes; and that this correlation holds for three phases of the language production process (target identification, sentence planning, and speaking). We also present a simple algorithm that uses scan patterns to accurately predict associated sentences by utilizing similarity-based retrieval.
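The abstract describes the prediction algorithm only at a high level: retrieve the sentence paired with the most similar stored scan pattern. A minimal sketch of such similarity-based retrieval, assuming scan patterns are sequences of fixated object labels and using cosine similarity over fixation counts (the paper's actual representation and similarity measure may differ):

```python
from collections import Counter
import math

def similarity(pattern_a, pattern_b):
    """Cosine similarity between two scan patterns, each given as a
    sequence of fixated object labels (compared as bags of fixations)."""
    a, b = Counter(pattern_a), Counter(pattern_b)
    dot = sum(a[k] * b[k] for k in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def predict_sentence(query_pattern, corpus):
    """Retrieve the sentence paired with the most similar scan pattern.
    corpus: list of (scan_pattern, sentence) pairs."""
    best = max(corpus, key=lambda pair: similarity(query_pattern, pair[0]))
    return best[1]

# Hypothetical example corpus of (scan pattern, description) pairs.
corpus = [
    (["man", "dog", "leash", "dog"], "The man is walking his dog."),
    (["woman", "bench", "book"], "The woman is reading on a bench."),
]
print(predict_sentence(["dog", "man", "dog"], corpus))
# prints "The man is walking his dog."
```

This nearest-neighbor scheme discards fixation order; sequence-aware similarity measures (e.g., alignment-based ones) would be a natural refinement.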
