Sentiment Analysis and Emotion Recognition from Speech Using Universal Speech Representations.

Affiliation

National Institute of Advanced Industrial Science and Technology, Tsukuba 305-8560, Japan.

Publication information

Sensors (Basel). 2022 Aug 24;22(17):6369. doi: 10.3390/s22176369.

Abstract

Understanding sentiment and emotion in speech is a challenging task in human multimodal language research. However, in certain cases, such as telephone calls, only audio data can be obtained. In this study, we independently evaluated sentiment analysis and emotion recognition from speech using recent self-supervised learning models, specifically universal speech representations with speaker-aware pre-training. Three sizes of universal models were evaluated on three sentiment tasks and one emotion task. The evaluation revealed that the best results were obtained on two-class sentiment analysis, with weighted and unweighted accuracy scores of 81% and 73%, respectively. This binary classification with unimodal acoustic analysis was also competitive with previous methods that used multimodal fusion. The models failed to make accurate predictions in the emotion recognition task and in sentiment analysis tasks with larger numbers of classes. The unbalanced nature of the datasets may also have contributed to the performance degradation observed on the six-class emotion, three-class sentiment, and seven-class sentiment tasks.
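
The "universal speech representations with speaker-aware pre-training" the abstract refers to are consistent with the UniSpeech-SAT model family. As a minimal sketch of how such representations can feed a downstream sentiment classifier, the snippet below mean-pools frame-level embeddings and fits a logistic-regression head; the Hugging Face checkpoint name, the pooling and classifier choices, and the WA/UA definitions (overall accuracy vs. mean per-class recall) are illustrative assumptions, not the paper's exact pipeline.

```python
# Sketch: mean-pooled UniSpeech-SAT embeddings + logistic regression for
# two-class speech sentiment. Checkpoint, pooling, and classifier are
# assumptions for illustration, not the paper's exact setup.
import numpy as np
import torch
from transformers import AutoFeatureExtractor, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, balanced_accuracy_score

MODEL = "microsoft/unispeech-sat-base-plus"  # one of several available sizes
extractor = AutoFeatureExtractor.from_pretrained(MODEL)
encoder = AutoModel.from_pretrained(MODEL).eval()

def embed(waveform: np.ndarray) -> np.ndarray:
    """Fixed-size utterance embedding: mean over the frame axis."""
    inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
    with torch.no_grad():
        frames = encoder(**inputs).last_hidden_state  # (1, T, hidden_dim)
    return frames.mean(dim=1).squeeze(0).numpy()      # (hidden_dim,)

# Dummy 1-second, 16 kHz clips stand in for real labeled utterances.
waves = [np.random.randn(16000).astype(np.float32) for _ in range(8)]
labels = [0, 1, 0, 1, 0, 1, 0, 1]  # binary sentiment (negative/positive)

X = np.stack([embed(w) for w in waves])
clf = LogisticRegression(max_iter=1000).fit(X[:6], labels[:6])
pred = clf.predict(X[6:])

# A common convention in speech emotion work (assumed here): weighted
# accuracy (WA) is overall accuracy; unweighted accuracy (UA) is the mean
# of per-class recalls, i.e. balanced accuracy.
print("WA:", accuracy_score(labels[6:], pred))
print("UA:", balanced_accuracy_score(labels[6:], pred))
```

On real data, the frozen encoder's embeddings would be extracted once per utterance and the lightweight head trained on top, which mirrors the usual way self-supervised speech representations are evaluated on downstream classification tasks.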

Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6481/9460459/e3074edb5b88/sensors-22-06369-g005.jpg
