Yin Zhong, Zhao Mengyuan, Wang Yongxiong, Yang Jingdong, Zhang Jianhua
Engineering Research Center of Optical Instrument and System, Ministry of Education, Shanghai Key Lab of Modern Optical System, University of Shanghai for Science and Technology, Shanghai, 200093, PR China.
School of Social Sciences, University of Shanghai for Science and Technology, Shanghai, 200093, PR China.
Comput Methods Programs Biomed. 2017 Mar;140:93-110. doi: 10.1016/j.cmpb.2016.12.005. Epub 2016 Dec 15.
Using deep-learning methodologies to analyze multimodal physiological signals has become increasingly attractive for recognizing human emotions. However, conventional deep emotion classifiers may suffer from two drawbacks: the expert knowledge required to determine the model structure, and the oversimplified combination of multimodal feature abstractions.
In this study, a multiple-fusion-layer based ensemble classifier of stacked autoencoders (MESAE) is proposed for recognizing emotions, in which the deep structure is identified through a physiological-data-driven approach. Each SAE consists of three hidden layers that filter out unwanted noise in the physiological features and derive stable feature representations. An additional deep model is used to form the SAE ensemble. The physiological features are split into several subsets according to the feature extraction approach, and each subset is encoded by a separate SAE. The resulting SAE abstractions are combined by physiological modality into six sets of encodings, which are then fed to a three-layer, adjacent-graph-based network for feature fusion. The fused features are used to recognize binary arousal or valence states.
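To make the overall pipeline concrete, the following is a minimal PyTorch sketch of an architecture of this general shape: one three-hidden-layer SAE per feature subset, with the deepest encodings concatenated and passed through a three-layer fusion network ending in a binary arousal/valence output. All layer sizes, the number of subsets, and the class names (SAEBranch, FusionClassifier) are illustrative assumptions, not the paper's data-driven choices; the paper's per-modality grouping into six encoding sets and its adjacent-graph-based fusion are replaced here by plain concatenation.

```python
import torch
import torch.nn as nn

class SAEBranch(nn.Module):
    """One stacked-autoencoder branch: three hidden layers that denoise a
    feature subset; the deepest encoding serves as its stable abstraction."""
    def __init__(self, in_dim, h1, h2, h3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, h1), nn.Sigmoid(),
            nn.Linear(h1, h2), nn.Sigmoid(),
            nn.Linear(h2, h3), nn.Sigmoid(),
        )
        # Decoder mirrors the encoder; used only for unsupervised pretraining.
        self.decoder = nn.Sequential(
            nn.Linear(h3, h2), nn.Sigmoid(),
            nn.Linear(h2, h1), nn.Sigmoid(),
            nn.Linear(h1, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

class FusionClassifier(nn.Module):
    """Ensemble of SAE branches (one per feature subset). Their encodings
    are concatenated and fused by a small three-layer network that outputs
    logits for a binary high/low arousal or valence state."""
    def __init__(self, subset_dims, code_dim=16):
        super().__init__()
        self.branches = nn.ModuleList(
            [SAEBranch(d, 64, 32, code_dim) for d in subset_dims]
        )
        fused_in = code_dim * len(subset_dims)
        self.fusion = nn.Sequential(
            nn.Linear(fused_in, 64), nn.Sigmoid(),
            nn.Linear(64, 32), nn.Sigmoid(),
            nn.Linear(32, 2),  # two logits: low vs. high state
        )

    def forward(self, subsets):
        codes = [branch(x)[0] for branch, x in zip(self.branches, subsets)]
        return self.fusion(torch.cat(codes, dim=1))

# Toy usage with three hypothetical feature subsets of different sizes.
model = FusionClassifier(subset_dims=[40, 25, 10])
subsets = [torch.randn(8, d) for d in (40, 25, 10)]
logits = model(subsets)  # shape: (8, 2)
```

In practice each SAEBranch would first be pretrained on its reconstruction loss before the fusion network is trained on the emotion labels, mirroring the usual SAE training regime.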
The DEAP multimodal database was employed to validate the performance of the MESAE. Compared with the best existing emotion classifier, the mean of the classification rate and F-score improves by 5.26%.
The superiority of the MESAE over state-of-the-art shallow and deep emotion classifiers is demonstrated across different numbers of available physiological instances.