Cross-Corpus Speech Emotion Recognition Based on Multi-Task Learning and Subdomain Adaptation

Authors

Fu Hongliang, Zhuang Zhihao, Wang Yang, Huang Chen, Duan Wenzhuo

Affiliations

College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China.

Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China.

Publication information

Entropy (Basel). 2023 Jan 7;25(1):124. doi: 10.3390/e25010124.

DOI: 10.3390/e25010124
PMID: 36673265
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9858266/

Abstract

To address feature distribution discrepancy in cross-corpus speech emotion recognition, this paper proposes an emotion recognition model based on multi-task learning and subdomain adaptation that alleviates the impact of this discrepancy on recognition. Existing methods have shortcomings in speech feature representation and in cross-corpus feature distribution alignment. The proposed model uses a deep denoising auto-encoder as the shared feature extraction network for multi-task learning, with a fully connected layer and a softmax layer added before each recognition task as task-specific layers. A subdomain adaptation algorithm for emotion and gender features is then added to the shared network to obtain emotion features and gender features that are shared between the source domain and the target domain. Multi-task learning enhances the representation ability of the features, while the subdomain adaptation algorithm improves their transferability and alleviates the impact of distribution differences in the emotional features. Averaged over six cross-corpus speech emotion recognition experiments, the weighted average recall is 1.89% to 10.07% higher than that of other models, which verifies the validity of the proposed model.
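The abstract does not state the exact form of the subdomain adaptation loss, so the following is only an illustrative sketch. It assumes a linear-kernel local maximum mean discrepancy: for each class, the mean of the source features (weighted by true labels) is compared with the mean of the target features (weighted by predicted class probabilities). The function names `class_mean` and `linear_lmmd` and the toy data are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def class_mean(features: Sequence[Sequence[float]],
               weights: Sequence[float]) -> List[float]:
    """Weighted mean of feature vectors; weights are normalised to sum to 1."""
    total = sum(weights)
    dim = len(features[0])
    return [
        sum(w * f[d] for f, w in zip(features, weights)) / total
        for d in range(dim)
    ]


def linear_lmmd(src_feats: Sequence[Sequence[float]],
                src_labels: Sequence[int],
                tgt_feats: Sequence[Sequence[float]],
                tgt_probs: Sequence[Sequence[float]],
                num_classes: int) -> float:
    """Assumed linear-kernel subdomain discrepancy (not the paper's exact loss).

    For each class c, take the squared Euclidean distance between the
    class-c source mean and the class-c target mean, then average over
    classes. Source weights are hard labels; target weights are soft
    predictions, because target labels are unavailable.
    """
    loss = 0.0
    for c in range(num_classes):
        src_w = [1.0 if y == c else 0.0 for y in src_labels]
        tgt_w = [p[c] for p in tgt_probs]
        # Skip a class that has no support in one of the domains.
        if sum(src_w) == 0 or sum(tgt_w) == 0:
            continue
        mu_s = class_mean(src_feats, src_w)
        mu_t = class_mean(tgt_feats, tgt_w)
        loss += sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))
    return loss / num_classes


if __name__ == "__main__":
    # Toy two-class example with 2-D features (made-up values).
    src = [[0.0, 0.0], [2.0, 2.0]]
    labels = [0, 1]
    tgt = [[1.0, 0.0], [2.0, 2.0]]
    probs = [[1.0, 0.0], [0.0, 1.0]]
    print(linear_lmmd(src, labels, tgt, probs, num_classes=2))
```

In a model like the one described, a term of this kind would presumably be computed once for the emotion task and once for the gender task on the shared features and added to the classification losses; the weighting between those terms is not given in the abstract.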

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bb6/9858266/819ac1904b37/entropy-25-00124-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bb6/9858266/734706160b8b/entropy-25-00124-g001.jpg

Similar articles

1. Cross-Corpus Speech Emotion Recognition Based on Multi-Task Learning and Subdomain Adaptation.
Entropy (Basel). 2023 Jan 7;25(1):124. doi: 10.3390/e25010124.
2. Progressively Discriminative Transfer Network for Cross-Corpus Speech Emotion Recognition.
Entropy (Basel). 2022 Jul 29;24(8):1046. doi: 10.3390/e24081046.
3. Cross-Corpus Speech Emotion Recognition Based on Transfer Learning and Multi-Loss Dynamic Adjustment.
Comput Intell Neurosci. 2022 Sep 20;2022:5019384. doi: 10.1155/2022/5019384. eCollection 2022.
4. Progressive distribution adapted neural networks for cross-corpus speech emotion recognition.
Front Neurorobot. 2022 Sep 15;16:987146. doi: 10.3389/fnbot.2022.987146. eCollection 2022.
5. Adapting Multiple Distributions for Bridging Emotions from Different Speech Corpora.
Entropy (Basel). 2022 Sep 5;24(9):1250. doi: 10.3390/e24091250.
6. Vector learning representation for generalized speech emotion recognition.
Heliyon. 2022 Mar 28;8(3):e09196. doi: 10.1016/j.heliyon.2022.e09196. eCollection 2022 Mar.
7. Multi-Stream Convolution-Recurrent Neural Networks Based on Attention Mechanism Fusion for Speech Emotion Recognition.
Entropy (Basel). 2022 Jul 26;24(8):1025. doi: 10.3390/e24081025.
8. Multi-source domain transfer network based on subdomain adaptation and minimum class confusion for EEG emotion recognition.
Comput Methods Biomech Biomed Engin. 2024 Oct 21:1-13. doi: 10.1080/10255842.2024.2417212.
9. Fusion-ConvBERT: Parallel Convolution and BERT Fusion for Speech Emotion Recognition.
Sensors (Basel). 2020 Nov 23;20(22):6688. doi: 10.3390/s20226688.
10. A New Network Structure for Speech Emotion Recognition Research.
Sensors (Basel). 2024 Feb 22;24(5):1429. doi: 10.3390/s24051429.

Cited by

1. A Survey of Deep Learning-Based Multimodal Emotion Recognition: Speech, Text, and Face.
Entropy (Basel). 2023 Oct 12;25(10):1440. doi: 10.3390/e25101440.
