Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, 200240, China.
Behav Res Methods. 2024 Feb;56(2):651-666. doi: 10.3758/s13428-023-02068-7. Epub 2023 Feb 8.
Sentiment analysis in Chinese natural language processing has largely relied on words annotated with sentiment categories or scores. Characters, however, are the basic orthographic, phonological, and, in most cases, semantic units of the Chinese language. This study collected sentiment annotations for 3827 characters. The ratings showed high reliability and were validated against the ratings of some characters' word equivalents reported in a previous norming study. Relations with other lexico-semantic variables and with character processing efficiency were also investigated. Furthermore, analyses of the association between constituent character valence and word valence revealed the semantic compositionality and sentiment fusion characteristic of larger Chinese linguistic units. These character ratings expand current Chinese sentiment lexicons and can be used for more precise stimulus assessment in research on Chinese character processing and for more efficient sentiment analysis equipped with annotations of single-character words.