Electrical Engineering and Electronic Information, Xihua University, Chengdu, China.
The 52nd Research Institute of China Electronics Technology Group Corporation, Haidian, China.
PLoS One. 2022 Aug 15;17(8):e0267132. doi: 10.1371/journal.pone.0267132. eCollection 2022.
Speech emotion recognition plays an important role in Human-Computer Interaction (HCI). To address the scarcity of speech emotion data, this paper proposes a novel speech emotion recognition method based on feature construction and ensemble learning. First, acoustic features are extracted from the speech signal and combined to form different original feature sets. Second, a novel feature selection method named L-SFS is proposed, based on the Light Gradient Boosting Machine (LightGBM) and the Sequential Forward Selection (SFS) method. Next, a softmax regression model is used to automatically learn the weights of four individual weak learners: Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Extreme Gradient Boosting (XGBoost), and LightGBM. Finally, using the automatically learned weights and a weighted average probability voting strategy, an ensemble classification model named Sklex is constructed, which integrates the four weak learners. The method demonstrates the effectiveness of feature construction and the superiority and stability of ensemble learning, and achieves good speech emotion recognition accuracy.
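The weighted average probability voting step can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`softmax`, `weighted_vote`, `learn_weights`), the toy probabilities, and the hyperparameters are all hypothetical. The weight-learning routine is an assumed stand-in for the softmax regression step. It parametrises the learner weights as a softmax over free logits and fits them by gradient descent on the ensemble log-loss. The four weak learners appear here only through their made-up per-class probability outputs.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_vote(probas, weights):
    """Weighted average of per-learner class probabilities for one sample.

    probas:  one probability vector per learner
    weights: one weight per learner (non-negative, summing to 1)
    """
    n_classes = len(probas[0])
    return [sum(w * p[c] for w, p in zip(weights, probas))
            for c in range(n_classes)]

def learn_weights(samples, labels, epochs=200, lr=0.5):
    """Fit learner weights by gradient descent on the ensemble log-loss.

    Assumption: weights = softmax(logits), so they stay positive and sum to 1.
    """
    n_learners = len(samples[0])
    logits = [0.0] * n_learners
    for _ in range(epochs):
        weights = softmax(logits)
        grad_w = [0.0] * n_learners
        for probas, y in zip(samples, labels):
            mixed = max(weighted_vote(probas, weights)[y], 1e-12)
            for k in range(n_learners):
                # d(-log mixed) / d w_k = -p_k[y] / mixed
                grad_w[k] -= probas[k][y] / mixed
        # Chain rule through the softmax: dL/dz_j = w_j * (g_j - sum_k w_k g_k)
        avg = sum(w * g for w, g in zip(weights, grad_w))
        for j in range(n_learners):
            logits[j] -= lr * weights[j] * (grad_w[j] - avg) / len(samples)
    return softmax(logits)

# Toy data (invented): 4 samples, 4 learners, 2 emotion classes.
# Each inner list holds one learner's class probabilities for that sample.
samples = [
    [[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.5, 0.5]],
    [[0.2, 0.8], [0.5, 0.5], [0.6, 0.4], [0.5, 0.5]],
    [[0.8, 0.2], [0.4, 0.6], [0.7, 0.3], [0.5, 0.5]],
    [[0.1, 0.9], [0.3, 0.7], [0.6, 0.4], [0.5, 0.5]],
]
labels = [0, 1, 0, 1]

weights = learn_weights(samples, labels)
predictions = []
for probas in samples:
    mixed = weighted_vote(probas, weights)
    predictions.append(mixed.index(max(mixed)))
```

In this toy setting the first learner assigns the highest probability to the true class on every sample, so the fitted weights should concentrate on it. With real classifiers, the weights would need to be fitted on held-out predictions rather than on the training data.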