Experimental Psychology Department, University College London.
Cogn Sci. 2020 Apr;44(4):e12830. doi: 10.1111/cogs.12830.
A number of recent models of semantics combine linguistic information, derived from text corpora, with visual information, derived from image collections, demonstrating that the resulting multimodal models account for behavioral data better than either of their unimodal counterparts. Empirical work on semantic processing has shown that emotion also plays an important role, especially for abstract concepts; however, models integrating emotion with linguistic and visual information are lacking. Here, we first improve on visual and affective representations derived from existing state-of-the-art models, by choosing the models that best fit available human semantic data and by extending the number of concepts they cover. Crucially, we then assess whether adding affective representations (obtained from a neural network model trained to predict emojis from co-occurring text) improves the model's fit to semantic similarity/relatedness judgments, relative to a purely linguistic model and a linguistic-visual model. We find that, with appropriate weights assigned to each model, adding both visual and affective representations improves performance: visual representations improve the fit especially for more concrete words, and affective representations improve it especially for more abstract words.
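The abstract describes fusing linguistic, visual, and affective representations with model-specific weights and evaluating the combination against human similarity/relatedness judgments. The sketch below illustrates one common way such a weighted fusion can be set up; it is not the authors' released code. The word list, embedding dimensionalities, weight values, and toy ratings are all placeholder assumptions for illustration only.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
WORDS = ["cat", "dog", "freedom", "justice"]

# Toy stand-ins for the three modality-specific embedding spaces
# (the paper derives these from text corpora, image collections, and an
# emoji-prediction network; random vectors are used here purely for illustration).
linguistic = {w: rng.normal(size=300) for w in WORDS}
visual     = {w: rng.normal(size=128) for w in WORDS}
affective  = {w: rng.normal(size=64)  for w in WORDS}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def multimodal_similarity(w1, w2, weights=(0.6, 0.2, 0.2)):
    """Weighted sum of per-modality cosine similarities for a word pair.
    The weights are free parameters; the values here are placeholders,
    not the weights fitted in the paper."""
    sims = (
        cosine(linguistic[w1], linguistic[w2]),
        cosine(visual[w1], visual[w2]),
        cosine(affective[w1], affective[w2]),
    )
    return sum(w * s for w, s in zip(weights, sims))

def evaluate(pairs, weights=(0.6, 0.2, 0.2)):
    """Spearman correlation between model similarities and human ratings.
    `pairs` is a list of (word1, word2, human_rating) tuples."""
    model = [multimodal_similarity(a, b, weights) for a, b, _ in pairs]
    human = [r for _, _, r in pairs]
    return spearmanr(model, human).correlation

# Example with made-up ratings:
toy_pairs = [("cat", "dog", 7.5), ("freedom", "justice", 6.8), ("cat", "freedom", 1.2)]
print(evaluate(toy_pairs))
```

In a setup like this, the modality weights could be tuned (e.g., by grid search or regression on held-out judgments), which corresponds to the abstract's note that the improvement holds "with appropriate weights assigned to each model."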