Institute of Psychology, Polish Academy of Sciences, SWPS University of Warsaw, Warsaw, Poland.
Faculty of Psychology, University of Warsaw, Warsaw, Poland.
Behav Res Methods. 2024 Aug;56(5):4716-4731. doi: 10.3758/s13428-023-02212-3. Epub 2023 Sep 25.
Data on the emotionality of words is important for selecting experimental stimuli and for sentiment analysis of large bodies of text. While norms for valence and arousal have been thoroughly collected in English, comparably large datasets are not available for most languages. Moreover, theoretical developments lead to new dimensions being proposed, the norms for which are only partially available. In this paper, we propose a transformer-based neural network architecture for extrapolating semantic and emotional norms that predicts a whole ensemble of norms at once while achieving state-of-the-art correlations with human judgments on each. We improve on previous approaches in correlation with human judgments by Δr = 0.1 on average. We discuss in detail the limitations of norm extrapolation as a whole, with a special focus on the introduced model. Further, we present a practical application of our model: a method of stimuli selection that performs unsupervised control by picking words that match in their semantic content. As the proposed model can easily be applied to different languages, we provide norm extrapolations for English, Polish, Dutch, German, French, and Spanish. To aid researchers, we also provide access to the extrapolation networks through an accessible web application.
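To make the idea of picking words that match in semantic content more concrete, the following is a minimal sketch and not the authors' procedure, which the abstract does not specify. It assumes each word already has a vector representation and greedily pairs every target word with its most similar unused candidate by cosine similarity. The function name `match_stimuli`, the greedy strategy, and the three-dimensional toy vectors are all hypothetical and chosen only for illustration.

```python
import numpy as np


def match_stimuli(targets, candidates, embeddings):
    """Greedily pair each target word with its most similar unused candidate.

    `embeddings` maps a word to a vector. The similarity measure (cosine)
    and the greedy order are illustrative assumptions, not the paper's method.
    """

    def unit(word):
        # Normalise to unit length so a dot product equals cosine similarity.
        vec = np.asarray(embeddings[word], dtype=float)
        return vec / np.linalg.norm(vec)

    remaining = list(candidates)
    pairs = []
    for target in targets:
        if not remaining:
            break  # no candidates left to match
        t = unit(target)
        sims = [float(t @ unit(c)) for c in remaining]
        best = int(np.argmax(sims))
        # Remove the chosen candidate so it is not reused for another target.
        pairs.append((target, remaining.pop(best), sims[best]))
    return pairs


# Toy vectors invented for this example; real use would take embeddings
# from a language model.
toy_embeddings = {
    "murder": [0.9, 0.1, 0.0],
    "party": [0.0, 0.2, 0.9],
    "theft": [0.8, 0.3, 0.1],
    "picnic": [0.1, 0.3, 0.8],
    "table": [0.3, 0.9, 0.2],
}

pairs = match_stimuli(["murder", "party"], ["table", "picnic", "theft"], toy_embeddings)
for target, match, similarity in pairs:
    print(f"{target} -> {match} (cosine similarity {similarity:.2f})")
```

A greedy pass is simple but order-dependent; a globally optimal assignment (for example, via the Hungarian algorithm) would be a natural alternative when the candidate pool is small.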