Emotion Recognition from Chinese Speech for Smart Affective Services Using a Combination of SVM and DBN.

Affiliations

Department of Software Engineering, China University of Petroleum, No. 66 Changjiang West Road, Qingdao 266031, China.

Department of Information Processing Science, University of Oulu, Oulu FI-91004, Finland.

Publication information

Sensors (Basel). 2017 Jul 24;17(7):1694. doi: 10.3390/s17071694.

DOI:10.3390/s17071694

PMID:28737705

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5539696/

Abstract

Accurate emotion recognition from speech is important for applications such as smart health care, smart entertainment, and other smart services. High-accuracy emotion recognition from Chinese speech is challenging because of the complexities of the Chinese language. In this paper, we explore how to improve the accuracy of speech emotion recognition, covering both speech signal feature extraction and emotion classification methods. Five types of features are extracted from each speech sample: mel frequency cepstrum coefficient (MFCC), pitch, formant, short-term zero-crossing rate, and short-term energy. By comparing statistical features with deep features extracted by a Deep Belief Network (DBN), we attempt to find the features that best identify the emotional state of speech. We propose a novel classification method that combines a DBN and an SVM (support vector machine) instead of using only one of them. In addition, a conjugate gradient method is applied to train the DBN in order to speed up the training process. Gender-dependent experiments are conducted using an emotional speech database created by the Chinese Academy of Sciences. The results show that DBN features reflect emotional state better than hand-crafted features, and our new classification approach achieves an accuracy of 95.8%, which is higher than using either DBN or SVM separately. The results also show that a DBN can work very well with small training databases if it is properly designed.
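
Two of the five feature types named in the abstract, short-term energy and short-term zero-crossing rate, are simple enough to sketch directly. The Python below is a minimal illustration under stated assumptions, not the authors' implementation: the frame length, hop size, synthetic sine-wave input, function names, and the choice of summary statistics are all hypothetical, since the abstract does not specify them. It frames a signal, computes both features per frame, and then reduces each per-frame track to utterance-level statistics of the kind a classifier such as an SVM could consume.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a sequence of samples into overlapping frames (no padding)."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def short_term_energy(frame):
    """Sum of squared amplitudes in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def utterance_statistics(values):
    """Mean, standard deviation, minimum and maximum of a per-frame track."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return {
        "mean": mean,
        "std": math.sqrt(var),
        "min": min(values),
        "max": max(values),
    }


# Assumption: a synthetic 200 Hz sine sampled at 8 kHz stands in for speech.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]

# Assumption: 25 ms frames (200 samples) with 50% overlap.
frames = frame_signal(signal, frame_len=200, hop=100)
energy_track = [short_term_energy(f) for f in frames]
zcr_track = [zero_crossing_rate(f) for f in frames]

energy_stats = utterance_statistics(energy_track)
zcr_stats = utterance_statistics(zcr_track)
```

For real recordings, the remaining features (MFCC, pitch, formants) would normally come from a signal-processing library rather than hand-written code, and the per-frame tracks could be fed to a DBN instead of being summarized by statistics, which is the comparison the abstract describes.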
