Determining the importance of frequency and contextual diversity in the lexical organization of multiword expressions.

Authors

Senaldi Marco S G, Titone Debra A, Johns Brendan T

Affiliation

Department of Psychology.

Publication information

Can J Exp Psychol. 2022 Jun;76(2):87-98. doi: 10.1037/cep0000271. Epub 2022 Feb 10.

Abstract

Corpus-based models of lexical strength have called into question the role of word frequency as an organizing principle of the lexicon, revealing that contextual and semantic diversity measures provide a closer fit to lexical behavior data (Adelman et al., 2006; Jones et al., 2012). Contextual diversity measures modify word frequency by ignoring word repetition within a context, while semantic diversity measures consider the semantic consistency of the contexts in which a word occurs. Recent research has shown that a better account of lexical organization data is provided by socially based measures of semantic diversity, which encode the communication patterns of individuals across discourses (Johns, 2021b). While most research on contextual diversity has focused on single words, recent corpus-based and experimental evidence suggests that an integral part of language use involves recurrent and more structurally complex units, such as multiword phrases and idioms. The aim of the present work was to determine whether contextual and semantic diversity drive lexical organization at the level of multiword units (here, operationalized as idiomatic expressions), in addition to single words. To this end, we analyzed normative ratings of familiarity for 210 English idioms (Libben & Titone, 2008) using a set of contextual, semantic, and socially based diversity measures that were computed from a 55-billion word corpus of Reddit comments. The results confirm the superiority of diversity measures over frequency for multiword expressions, suggesting that multiword units, such as idiomatic phrases, show lexical organization dynamics similar to those of single words. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
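
The difference between raw frequency and contextual diversity described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `frequency_and_diversity`, the toy corpus, and the choice to treat each string as one "context" are assumptions made purely for illustration, and the semantic and socially based measures mentioned in the abstract are not modeled here.

```python
def frequency_and_diversity(documents, phrase):
    """Return (frequency, contextual_diversity) for a phrase.

    frequency: total number of occurrences across all documents.
    contextual_diversity: number of documents containing the phrase
    at least once, so repetitions inside one document are ignored.
    Matching is a simple case-insensitive substring count (an
    illustrative simplification, not a real tokenizer).
    """
    phrase = phrase.lower()
    frequency = 0
    diversity = 0
    for text in documents:
        occurrences = text.lower().count(phrase)
        frequency += occurrences
        if occurrences > 0:
            diversity += 1
    return frequency, diversity


# Hypothetical toy corpus: each string stands in for one context.
toy_corpus = [
    "He kicked the bucket. Yes, he really kicked the bucket.",
    "She spilled the beans about the party.",
    "Nobody kicked the bucket in that movie.",
]

freq, cd = frequency_and_diversity(toy_corpus, "kicked the bucket")
print(freq, cd)  # 3 occurrences, but only 2 distinct contexts
```

In this toy case the idiom occurs three times but in only two contexts, so the repeated use within the first context raises frequency without raising contextual diversity.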
