Lee Sanghyub John, Lim JongYoon, Paas Leo, Ahn Ho Seok
Marketing Department, University of Auckland Business School, Auckland, 1142 New Zealand.
CARES, Department of Electrical, Computer and Software Engineering, University of Auckland, Auckland, 1142 New Zealand.
Neural Comput Appl. 2023;35(15):10945-10956. doi: 10.1007/s00521-023-08276-8. Epub 2023 Jan 26.
Tactics to determine the emotions of authors of texts such as Twitter messages often rely on multiple annotators who label relatively small data sets of text passages. An alternative method gathers large text databases that contain the authors' self-reported emotions, to which artificial intelligence, machine learning, and natural language processing tools can be applied. Both approaches have strengths and weaknesses. Emotions evaluated by a few human annotators are susceptible to idiosyncratic biases that reflect the characteristics of the annotators. But models based on large, self-reported emotion data sets may overlook subtle, social emotions that human annotators can recognize. In seeking to establish a means to train emotion detection models so that they can achieve good performance in different contexts, the current study proposes a novel transformer transfer learning approach that parallels human development stages: (1) detect emotions reported by the texts' authors and (2) synchronize the model with social emotions identified in annotator-rated emotion data sets. The analysis, based on a large, novel, self-reported emotion data set (n = 3,654,544) and applied to 10 previously published data sets, shows that the transfer learning emotion model achieves relatively strong performance.
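The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a tiny bag-of-words softmax classifier stands in for the transformer, the class name `TinyEmotionClassifier`, the label set, the learning rates, and all example sentences are hypothetical, and the sketch assumes both stages share one label set. It only shows the training order: first fit on self-reported emotions, then continue training the same weights on annotator-rated examples.

```python
import math
from collections import defaultdict


class TinyEmotionClassifier:
    """Bag-of-words softmax classifier (hypothetical stand-in for a transformer)."""

    def __init__(self, labels):
        self.labels = list(labels)
        # weights[label][token] -> float, defaulting to 0.0 for unseen tokens
        self.weights = {lab: defaultdict(float) for lab in self.labels}

    def predict_proba(self, text):
        tokens = text.lower().split()
        scores = {lab: sum(self.weights[lab][t] for t in tokens)
                  for lab in self.labels}
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {lab: math.exp(s - top) for lab, s in scores.items()}
        total = sum(exps.values())
        return {lab: e / total for lab, e in exps.items()}

    def predict(self, text):
        probs = self.predict_proba(text)
        return max(probs, key=probs.get)

    def fit(self, examples, epochs, lr):
        """Stochastic gradient descent on cross-entropy; keeps existing weights."""
        for _ in range(epochs):
            for text, gold in examples:
                tokens = text.lower().split()
                probs = self.predict_proba(text)
                for lab in self.labels:
                    grad = probs[lab] - (1.0 if lab == gold else 0.0)
                    for t in tokens:
                        self.weights[lab][t] -= lr * grad
        return self


# Made-up examples for illustration only; not taken from the study's data.
self_reported = [
    ("i feel so happy today", "joy"),
    ("feeling happy and grateful", "joy"),
    ("i am sad and lonely", "sadness"),
    ("so sad i could cry", "sadness"),
    ("i am angry at everyone", "anger"),
    ("angry and fed up", "anger"),
]
annotator_rated = [
    ("thanks a lot for nothing", "anger"),       # sarcasm an annotator might flag
    ("congratulations on the new job", "joy"),
]

model = TinyEmotionClassifier(["joy", "sadness", "anger"])

# Stage 1: learn from the authors' self-reported emotions.
model.fit(self_reported, epochs=50, lr=0.5)

# Stage 2: continue training the same weights on annotator-rated labels,
# with a smaller learning rate (an assumption of this sketch).
model.fit(annotator_rated, epochs=50, lr=0.1)

print(model.predict("happy"), model.predict("nothing"))
```

Because stage 2 reuses the weights learned in stage 1 rather than starting from scratch, the toy model keeps what it learned from self-reported text while picking up the annotator-rated cues.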