Max Planck Institute for the Science of Human History, Kahlaische Str. 10, 07745, Jena, Germany.
Max Planck Institute for Evolutionary Anthropology, Deutscher Platz 6, 04103, Leipzig, Germany.
Behav Res Methods. 2022 Apr;54(2):864-884. doi: 10.3758/s13428-021-01650-1. Epub 2021 Aug 6.
Psychologists and linguists collect various data on word and concept properties. In psychology, scholars have accumulated norms and ratings for a large number of words in languages with many speakers. In linguistics, scholars have accumulated cross-linguistic information about the relations between words and concepts. Until now, however, there have been no efforts to combine information from the two fields, which would allow comparison of psychological and linguistic properties across different languages. The Database of Cross-Linguistic Norms, Ratings, and Relations for Words and Concepts (NoRaRe) is the first attempt to close this gap. Building on a reference catalog that standardizes the concepts used in historical and typological language comparison, it integrates data from psychology and linguistics, collected from 98 data sets and covering 65 unique properties for 40 languages. The database is curated with the help of manual, automated, and semi-automated workflows and uses a software API to control and access the data. The database is accessible via a web application, the software API, or scripting languages. In this study, we present how the database is structured, how it can be extended, and how we control the quality of the data curation process. To illustrate its application, we present three case studies that test the validity of our approach, the accuracy of our workflows, and the integrative potential of the database. With regular version updates, the NoRaRe database has the potential to advance research in psychology and linguistics by offering researchers an integrated perspective on both fields.