Psycholinguistic dataset on language use in 1145 novels published in English and Dutch.

Author information

Luoto Severi, van Cranenburgh Andreas

Affiliations

English, Drama and Writing Studies, University of Auckland, 1010 Auckland, New Zealand.

School of Psychology, University of Auckland, 1010 Auckland, New Zealand.

Publication information

Data Brief. 2020 Dec 16;34:106655. doi: 10.1016/j.dib.2020.106655. eCollection 2021 Feb.

Abstract

This dataset includes psycholinguistic data on 694 English-language and 451 Dutch-language novels, acquired with computerised analysis of digitised novels published mainly between 1800 and 2018. The English-language novels have a total word count of 66.9 million words, while the Dutch-language novels comprise 49.6 million words, therefore offering large, representative samples for both languages. The data provided in this article include 93 linguistic and psycholinguistic outcome variables for the English-language novels, acquired using Linguistic Inquiry and Word Count (LIWC) version 2015, and 68 linguistic and psycholinguistic outcome variables for the Dutch-language novels, acquired using Linguistic Inquiry and Word Count (LIWC) version 2001. The dataset also includes word frequencies (unigram and bigram) for each novel. The metadata for each novel include year of publication, authors' nationality, sex, age at publication, and sexual orientation (the latter only in the English-language dataset), making it possible for researchers to study the data along these parameters. The use of these data can help researchers illuminate how word use reflects psychological processes in more than two centuries of literary art in English and in contemporary Dutch novels.
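As an illustration of what per-novel unigram and bigram frequencies look like, the following minimal Python sketch counts them for a short sample string. The function name `ngram_counts`, the sample sentence, and the tokenisation rule (lowercased runs of letters) are assumptions made for this example; the abstract does not specify how the authors tokenised the novels, so this is not a description of their pipeline.

```python
import re
from collections import Counter


def ngram_counts(text, n=1):
    """Count n-grams in a text (hypothetical helper, not the authors' code).

    Tokenisation is a simplification: lowercase the text and keep runs of
    letters, so punctuation and digits are dropped and "don't" splits in two.
    """
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    # Slide a window of length n over the token list.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


# Illustrative input only; a real run would read the full text of one novel.
sample = "The cat sat on the mat, and the cat slept."
unigrams = ngram_counts(sample, n=1)
bigrams = ngram_counts(sample, n=2)

print(unigrams.most_common(2))
print(bigrams.most_common(1))
```

Dividing each count by the total number of tokens in the novel would give relative frequencies, which are easier to compare across novels of different lengths.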

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/17a4/7772540/21988198bb8d/gr1.jpg
