Sentiment Analysis and Emotion Recognition from Speech Using Universal Speech Representations.

Affiliation

National Institute of Advanced Industrial Science and Technology, Tsukuba 305-8560, Japan.

Publication Info

Sensors (Basel). 2022 Aug 24;22(17):6369. doi: 10.3390/s22176369.

DOI: 10.3390/s22176369
PMID: 36080828
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9460459/
Abstract

The study of understanding sentiment and emotion in speech is a challenging task in human multimodal language. However, in certain cases, such as telephone calls, only audio data can be obtained. In this study, we independently evaluated sentiment analysis and emotion recognition from speech using recent self-supervised learning models, specifically, universal speech representations with speaker-aware pre-training models. Three different sizes of universal models were evaluated for three sentiment tasks and an emotion task. The evaluation revealed that the best results were obtained with two classes of sentiment analysis, based on both weighted and unweighted accuracy scores (81% and 73%). This binary classification with unimodal acoustic analysis also performed competitively compared to previous methods which used multimodal fusion. The models failed to make accurate predictions in an emotion recognition task and in sentiment analysis tasks with higher numbers of classes. The unbalanced property of the datasets may also have contributed to the performance degradations observed in the six-class emotion, three-class sentiment, and seven-class sentiment tasks.
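The abstract reports both weighted and unweighted accuracy (81% and 73%). As a minimal sketch, assuming the definitions standard in the speech emotion recognition literature (weighted accuracy = overall fraction of correct predictions; unweighted accuracy = mean of per-class recalls, which is robust to class imbalance), the two metrics can be computed like this; the toy labels below are illustrative, not from the paper:

```python
from collections import defaultdict

def weighted_accuracy(y_true, y_pred):
    """Overall fraction of correct predictions (favors majority classes)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls (every class counts equally)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += (t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Imbalanced toy data: 4 "pos" samples, 2 "neg" samples
y_true = ["pos", "pos", "pos", "pos", "neg", "neg"]
y_pred = ["pos", "pos", "pos", "pos", "neg", "pos"]
print(weighted_accuracy(y_true, y_pred))    # 5/6 ≈ 0.833
print(unweighted_accuracy(y_true, y_pred))  # (1.0 + 0.5) / 2 = 0.75
```

The gap between the two numbers in the toy example mirrors the paper's observation that dataset imbalance degrades results: the majority class dominates weighted accuracy, while unweighted accuracy exposes the weaker minority-class performance.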

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6481/9460459/e3074edb5b88/sensors-22-06369-g005.jpg

Similar Articles

1. Sentiment Analysis and Emotion Recognition from Speech Using Universal Speech Representations. Sensors (Basel). 2022 Aug 24;22(17):6369. doi: 10.3390/s22176369.
2. Audio-Based Emotion Recognition Using Self-Supervised Learning on an Engineered Feature Space. AI (Basel). 2024 Mar;5(1):195-207. doi: 10.3390/ai5010011. Epub 2024 Jan 17.
3. Weighted Joint Sentiment-Topic Model for Sentiment Analysis Compared to ALGA: Adaptive Lexicon Learning Using Genetic Algorithm. Comput Intell Neurosci. 2022 Jul 31;2022:7612276. doi: 10.1155/2022/7612276. eCollection 2022.
4. AFR-BERT: Attention-based mechanism feature relevance fusion multimodal sentiment analysis model. PLoS One. 2022 Sep 9;17(9):e0273936. doi: 10.1371/journal.pone.0273936. eCollection 2022.
5. Research on Chinese Speech Emotion Recognition Based on Deep Neural Network and Acoustic Features. Sensors (Basel). 2022 Jun 23;22(13):4744. doi: 10.3390/s22134744.
6. BMT-Net: Broad Multitask Transformer Network for Sentiment Analysis. IEEE Trans Cybern. 2022 Jul;52(7):6232-6243. doi: 10.1109/TCYB.2021.3050508. Epub 2022 Jul 4.
7. Cross-Modal Sentiment Sensing with Visual-Augmented Representation and Diverse Decision Fusion. Sensors (Basel). 2021 Dec 23;22(1):74. doi: 10.3390/s22010074.
8. A comprehensive study on bilingual and multilingual speech emotion recognition using a two-pass classification scheme. PLoS One. 2019 Aug 15;14(8):e0220386. doi: 10.1371/journal.pone.0220386. eCollection 2019.
9. Multimodal Emotion Detection via Attention-Based Fusion of Extracted Facial and Speech Features. Sensors (Basel). 2023 Jun 9;23(12):5475. doi: 10.3390/s23125475.
10. Semi-supervised distributed representations of documents for sentiment analysis. Neural Netw. 2019 Nov;119:139-150. doi: 10.1016/j.neunet.2019.08.001. Epub 2019 Aug 6.

Cited By

1. Challenges and standardisation strategies for sensor-based data collection for digital phenotyping. Commun Med (Lond). 2025 Aug 19;5(1):360. doi: 10.1038/s43856-025-01013-3.
2. Voice analysis and deep learning for detecting mental disorders in pregnant women: a cross-sectional study. Discov Ment Health. 2025 Feb 8;5(1):12. doi: 10.1007/s44192-025-00138-0.
3. Facial Expression Recognition for Measuring Jurors' Attention in Acoustic Jury Tests. Sensors (Basel). 2024 Apr 4;24(7):2298. doi: 10.3390/s24072298.
4. Audio-Visual Fusion Based on Interactive Attention for Person Verification. Sensors (Basel). 2023 Dec 15;23(24):9845. doi: 10.3390/s23249845.
5. Enhancing Speech Emotion Recognition Using Dual Feature Extraction Encoders. Sensors (Basel). 2023 Jul 24;23(14):6640. doi: 10.3390/s23146640.
6. Emotion Detection Based on Pupil Variation. Healthcare (Basel). 2023 Jan 21;11(3):322. doi: 10.3390/healthcare11030322.

References

1. Dawn of the Transformer Era in Speech Emotion Recognition: Closing the Valence Gap. IEEE Trans Pattern Anal Mach Intell. 2023 Sep;45(9):10745-10759. doi: 10.1109/TPAMI.2023.3263585. Epub 2023 Aug 7.
2. Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis. Proc Conf Empir Methods Nat Lang Process. 2020 Nov;2020:1823-1833. doi: 10.18653/v1/2020.emnlp-main.143.
3. Multimodal Transformer for Unaligned Multimodal Language Sequences. Proc Conf Assoc Comput Linguist Meet. 2019 Jul;2019:6558-6569. doi: 10.18653/v1/p19-1656.
4. Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors. Proc AAAI Conf Artif Intell. 2019 Jul;33(1):7216-7223.
5. Does Neutral Affect Exist? How Challenging Three Beliefs About Neutral Affect Can Advance Affective Research. Front Psychol. 2019 Nov 8;10:2476. doi: 10.3389/fpsyg.2019.02476. eCollection 2019.
6. Basic Emotions, Natural Kinds, Emotion Schemas, and a New Paradigm. Perspect Psychol Sci. 2007 Sep;2(3):260-80. doi: 10.1111/j.1745-6916.2007.00044.x.