Department of Informatics & Artificial Intelligence eXploration Research Center, The University of Electro-Communications.
Cogn Sci. 2020 Jun;44(6):e12844. doi: 10.1111/cogs.12844.
Distributional semantic models, or word embeddings, are pervasively used in both cognitive modeling and practical applications because of their remarkable ability to represent the meanings of words. However, relatively little effort has been made to explore what types of information are encoded in distributional word vectors. Knowing what knowledge is embedded in word vectors is important for cognitive modeling with distributional semantic models. In this paper, we therefore attempt to identify the knowledge encoded in word vectors through a computational experiment using Binder et al.'s (2016) featural conceptual representations, which are based on neurobiologically motivated attributes. In the experiment, these conceptual vectors are predicted from text-based word vectors using a neural network and a linear transformation, and prediction performance is compared across types of information. The analysis demonstrates that abstract information is generally predicted more accurately from word vectors than perceptual and spatiotemporal information, and that prediction accuracy is particularly high for cognitive and social information. Emotional information is also found to be successfully predicted for abstract words. These results indicate that language can be a major source of knowledge about abstract attributes, and they support the recent view that emphasizes the importance of language for abstract concepts. Furthermore, we show that word vectors can capture some types of perceptual and spatiotemporal information about concrete concepts and some relevant word categories. This suggests that language statistics can encode more perceptual knowledge than is often expected.
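The mapping described in the abstract can be illustrated with a minimal sketch. This is not the paper's code: the embedding dimensionality, the number of words, and the synthetic data are all hypothetical, and only the linear-transformation baseline (fit by least squares) is shown, not the neural network. The 65 feature dimensions stand in for Binder et al.'s (2016) attribute ratings.

```python
import numpy as np

# Illustrative sketch (not the authors' implementation): predict featural
# conceptual vectors from text-based word vectors with a linear map, then
# score per-feature prediction accuracy on held-out words.

rng = np.random.default_rng(0)

n_words, emb_dim, n_features = 200, 50, 65  # hypothetical sizes
word_vecs = rng.normal(size=(n_words, emb_dim))  # stand-in word embeddings
true_map = rng.normal(size=(emb_dim, n_features))
# Synthetic "conceptual vectors": a linear function of the embeddings plus noise.
feature_vecs = word_vecs @ true_map + 0.1 * rng.normal(size=(n_words, n_features))

# Train/test split; fit W minimizing ||X_tr @ W - Y_tr||^2.
X_tr, X_te = word_vecs[:150], word_vecs[150:]
Y_tr, Y_te = feature_vecs[:150], feature_vecs[150:]
W, *_ = np.linalg.lstsq(X_tr, Y_tr, rcond=None)

# Per-feature accuracy: correlation between predicted and true ratings on
# held-out words; in the paper, such scores are compared across feature types
# (perceptual, spatiotemporal, cognitive, social, emotional, ...).
Y_pred = X_te @ W
corrs = [np.corrcoef(Y_pred[:, j], Y_te[:, j])[0, 1] for j in range(n_features)]
print(round(float(np.mean(corrs)), 2))
```

With real data, each feature column would hold human attribute ratings for the word set, and the per-feature correlations would be grouped by attribute type to compare how well language-derived vectors encode each kind of information.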