The Croatian psycholinguistic database: Estimates for 6000 nouns, verbs, adjectives and adverbs.

Affiliations

Faculty of Humanities and Social Sciences, University of Zagreb, Ivana Lučića 3, 10000, Zagreb, Croatia.

Department of South Slavic Languages and Literatures, Zagreb, Croatia.

Publication information

Behav Res Methods. 2021 Aug;53(4):1799-1816. doi: 10.3758/s13428-020-01533-x. Epub 2021 Apr 26.

Abstract

Psycholinguistic databases containing ratings of concreteness, imageability, age of acquisition, and subjective frequency are used in psycholinguistic and neurolinguistic studies which require words as stimuli. Linguistic characteristics (e.g. word length, corpus frequency) are frequently coded, but word class is seldom systematically treated, although there are indications of its significance for imageability and concreteness. This paper presents the Croatian Psycholinguistic Database (CPD; available at: https://doi.org/10.17234/megahr.2019.hpb), containing 6000 Croatian nouns, verbs, adjectives and adverbs, rated for concreteness, imageability, age of acquisition, and subjective frequency. Moreover, we present computationally obtained extrapolations of concreteness and imageability to the remainder of the Croatian lexicon (available at: https://github.com/megahr/lexicon/blob/master/predictions/hr_c_i.predictions.txt). In the two studies presented here, we explore the significance of word class for concreteness and imageability in human and computationally obtained ratings. The observed correlations in the CPD indicate correspondences between psycholinguistic measures expected from the literature. Word classes exhibit differences in subjective frequency, age of acquisition, concreteness and imageability, with significant differences between nouns, verbs, adjectives and adverbs. In the computational study which focused on concreteness and imageability, concreteness obtained higher correlations with human ratings than imageability, and the system underpredicted the concreteness of nouns, and overpredicted the concreteness of adjectives and adverbs. Overall, this suggests that word class contains schematic conceptual and distributional information. Schematic conceptual content seems to be more significant in human ratings of concreteness and less significant in computationally obtained ratings, where distributional information seems to play a more significant role. This suggests that word class differences should be theoretically explored.
