Construction of an Emotional Lexicon of Patients With Breast Cancer: Development and Sentiment Analysis.

Author information

Nanfang Hospital, Southern Medical University, Guangzhou, China.

School of Nursing, Southern Medical University, Guangzhou, China.

Publication information

J Med Internet Res. 2023 Sep 12;25:e44897. doi: 10.2196/44897.

Abstract

BACKGROUND

Sentiment analysis based on an emotional lexicon is well suited to capturing emotional information such as individual attitudes, experiences, and needs, and it offers a new approach to emotion recognition and management for patients with breast cancer (BC). However, sentiment analysis in the field of BC remains limited, and no emotional lexicon exists for this field. An emotional lexicon that reflects the characteristics of patients with BC is therefore needed, both as a tool for accurately identifying and analyzing patients' emotions and as a basis for personalized emotion management.

OBJECTIVE

This study aimed to construct an emotional lexicon of patients with BC.

METHODS

Emotional words were obtained by merging the words in 2 general sentiment lexicons, the Chinese Linguistic Inquiry and Word Count (C-LIWC) and HowNet, and the words in text corpora acquired from patients with BC via Weibo, semistructured interviews, and expressive writing. The lexicon was constructed using manual annotation and classification under the guidance of Russell's valence-arousal space. Ekman's basic emotional categories, Lazarus' cognitive appraisal theory of emotion, and a qualitative text analysis based on the text corpora of patients with BC were combined to determine the fine-grained emotional categories of the lexicon we constructed. Precision, recall, and the F1-score were used to evaluate the lexicon's performance.
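The merge-and-evaluate workflow described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy English words, the `positive`/`negative` labels, and the helper names `merge_lexicons` and `precision_recall_f1` are all hypothetical, and the real lexicon was built through manual annotation of Chinese text rather than a simple dictionary union.

```python
def merge_lexicons(*lexicons):
    """Union several word -> polarity dicts; later lexicons override earlier ones."""
    merged = {}
    for lex in lexicons:
        merged.update(lex)
    return merged


def precision_recall_f1(predicted, gold, label):
    """Per-label precision, recall, and F1 for word-level polarity predictions."""
    true_pos = sum(
        1 for word, lab in predicted.items()
        if lab == label and gold.get(word) == label
    )
    n_predicted = sum(1 for lab in predicted.values() if lab == label)
    n_gold = sum(1 for lab in gold.values() if lab == label)
    precision = true_pos / n_predicted if n_predicted else 0.0
    recall = true_pos / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy entries standing in for a general lexicon and domain-specific words.
general_lexicon = {"happy": "positive", "sad": "negative"}
domain_words = {"relapse": "negative", "remission": "positive"}
lexicon = merge_lexicons(general_lexicon, domain_words)

# Hypothetical gold annotations; "metastasis" is deliberately missing from the lexicon.
gold = {
    "happy": "positive",
    "remission": "positive",
    "sad": "negative",
    "relapse": "negative",
    "metastasis": "negative",
}
predicted = {word: lexicon.get(word, "unknown") for word in gold}

neg_p, neg_r, neg_f1 = precision_recall_f1(predicted, gold, "negative")
pos_p, pos_r, pos_f1 = precision_recall_f1(predicted, gold, "positive")
```

In this toy setup, the word missing from the lexicon lowers recall for the negative class without affecting precision, which is the kind of coverage gap a domain-specific lexicon is meant to close.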

RESULTS

The text corpora collected from patients in different stages of BC included 150 written materials, 17 interviews, and 6689 original posts and comments from Weibo, with a total of 1,923,593 Chinese characters. The emotional lexicon of patients with BC contained 9357 words and covered 8 fine-grained emotional categories: joy, anger, sadness, fear, disgust, surprise, somatic symptoms, and BC terminology. Precision, recall, and the F1-score were 98.42%, 99.73%, and 99.07%, respectively, for positive emotional words and 99.73%, 98.38%, and 99.05%, respectively, for negative emotional words, all significantly higher than those of the C-LIWC and HowNet.
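As a consistency check on the reported figures, the F1-score is the harmonic mean of precision and recall, so it can be recomputed from the two values given for each polarity. The sketch below assumes only that standard definition; the function name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (both in the same units, here percent)."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
positive_f1 = round(f1_from(98.42, 99.73), 2)
negative_f1 = round(f1_from(99.73, 98.38), 2)
```

Both recomputed values, 99.07 and 99.05, agree with the F1-scores reported for positive and negative emotional words.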

CONCLUSIONS

The emotional lexicon with fine-grained emotional categories conforms to the characteristics of patients with BC. It identifies and classifies domain-specific emotional words in BC better than the C-LIWC and HowNet. The lexicon provides a new tool for sentiment analysis in the field of BC, as well as a new perspective for recognizing the specific emotional states and needs of patients with BC and for formulating tailored emotional management plans.
