Cross-Corpus Speech Emotion Recognition Based on Multi-Task Learning and Subdomain Adaptation.

Authors

Fu Hongliang, Zhuang Zhihao, Wang Yang, Huang Chen, Duan Wenzhuo

Affiliations

College of Information Science and Engineering, Henan University of Technology, Zhengzhou 450001, China.

Henan Engineering Laboratory of Grain IOT Technology, Henan University of Technology, Zhengzhou 450001, China.

Publication information

Entropy (Basel). 2023 Jan 7;25(1):124. doi: 10.3390/e25010124.

Abstract

To address feature distribution discrepancy in cross-corpus speech emotion recognition, this paper proposes an emotion recognition model based on multi-task learning and subdomain adaptation that alleviates the impact of this discrepancy on recognition. Existing methods fall short in speech feature representation and in aligning feature distributions across corpora. The proposed model uses a deep denoising auto-encoder as the shared feature extraction network for multi-task learning, and a fully connected layer and a softmax layer are added before each recognition task as task-specific layers. A subdomain adaptation algorithm for emotion and gender features is then added to the shared network to obtain the shared emotion features and shared gender features of the source and target domains, respectively. Multi-task learning enhances the representation ability of the features, while subdomain adaptation improves their transferability and alleviates the impact of distribution differences in the emotional features. Averaged over six cross-corpus speech emotion recognition experiments, the weighted average recall is 1.89% to 10.07% higher than that of the compared models, which verifies the validity of the proposed model.
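The abstract does not spell out the subdomain adaptation loss, so the following is only an illustrative sketch, not the authors' implementation. It assumes a class-wise (local) maximum mean discrepancy with a Gaussian kernel, a common way to align source and target features per class rather than globally. The function names, the kernel width `sigma`, and the use of soft target-domain predictions as class weights are all assumptions made for illustration. In a multi-task setting, the same loss could be computed once with emotion labels and once with gender labels.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Pairwise Gaussian kernel between rows of a (n, d) and b (m, d)."""
    d2 = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def class_weights(probs):
    """Normalise each class column of an (n, C) label matrix to sum to 1.

    Rows may be one-hot labels (source) or softmax outputs (target).
    Columns with no mass are left as zeros.
    """
    col = probs.sum(axis=0, keepdims=True)
    return probs / np.where(col > 0, col, 1.0)


def subdomain_mmd(xs, ys_onehot, xt, yt_prob, sigma=1.0):
    """Hypothetical class-wise MMD between source and target features.

    xs, xt:     shared-layer features, shapes (ns, d) and (nt, d)
    ys_onehot:  source labels as one-hot rows, shape (ns, C)
    yt_prob:    target class probabilities (pseudo-labels), shape (nt, C)
    """
    ws = class_weights(ys_onehot)
    wt = class_weights(yt_prob)
    kss = gaussian_kernel(xs, xs, sigma)
    ktt = gaussian_kernel(xt, xt, sigma)
    kst = gaussian_kernel(xs, xt, sigma)
    total = 0.0
    for c in range(ws.shape[1]):
        a, b = ws[:, c], wt[:, c]
        # Squared MMD between the class-c weighted source and target samples.
        total += a @ kss @ a + b @ ktt @ b - 2.0 * a @ kst @ b
    return total / ws.shape[1]
```

Under these assumptions, the loss is zero when the two domains have identical per-class feature distributions and grows as the class-conditional distributions drift apart, which is the property a subdomain alignment term needs.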

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9bb6/9858266/734706160b8b/entropy-25-00124-g001.jpg
