Yuvaraj Rajamanickam, Baranwal Arapan, Prince A Amalin, Murugappan M, Mohammed Javeed Shaikh
National Institute of Education, Nanyang Technological University, Singapore 637616, Singapore.
Department of Computer Science and Information Systems, BITS Pilani, Sancoale 403726, Goa, India.
Brain Sci. 2023 Apr 19;13(4):685. doi: 10.3390/brainsci13040685.
The recognition of emotions is one of the most challenging issues in human-computer interaction (HCI). EEG signals are widely adopted for emotion recognition because of their ease of acquisition, mobility, and convenience. Deep neural networks (DNNs) have provided excellent results in emotion recognition studies. Most studies, however, rely on separate methods to extract handcrafted features, such as the Pearson correlation coefficient (PCC), principal component analysis (PCA), and the Higuchi fractal dimension (HFD), even though DNNs are capable of generating meaningful features on their own. Furthermore, most earlier studies largely ignored the spatial information between channels, focusing mainly on time-domain and frequency-domain representations. This study utilizes a pre-trained 3D-CNN MobileNet model with transfer learning on a spatio-temporal representation of EEG signals to extract features for emotion recognition. In addition to fully connected layers, hybrid models were explored with other decision layers: multilayer perceptron (MLP), k-nearest neighbor (KNN), extreme learning machine (ELM), XGBoost (XGB), random forest (RF), and support vector machine (SVM). Additionally, this study investigates the effect of post-processing, i.e., filtering the output labels. Extensive experiments were conducted on the SJTU Emotion EEG Dataset (SEED) (three classes) and SEED-IV (four classes), and the results obtained were comparable to the state of the art. The conventional 3D-CNN with an ELM classifier achieved maximum accuracies of 89.18% on SEED and 81.60% on SEED-IV. Post-filtering further improved the classification performance of the hybrid 3D-CNN with ELM model, to 90.85% on SEED and 83.71% on SEED-IV. Accordingly, spatio-temporal features extracted from the EEG, along with ensemble classifiers, were found to be the most effective in recognizing emotions compared to state-of-the-art methods.
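The hybrid models pair deep features from the 3D-CNN with a separate decision layer such as an ELM. As a minimal sketch of the ELM idea only (not the authors' implementation; hidden-layer size, activation, and the synthetic feature matrix are all assumptions), the classifier fixes random hidden weights and solves for the output weights in closed form via the pseudoinverse:

```python
import numpy as np

class ELM:
    """Extreme learning machine: random fixed hidden layer,
    output weights solved by least squares (Moore-Penrose pseudoinverse)."""

    def __init__(self, n_hidden=512, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        # Input weights and biases are drawn once and never trained.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = np.tanh(X @ self.W + self.b)   # hidden-layer activations
        T = np.eye(n_classes)[y]           # one-hot targets
        # Only the output weights beta are "learned", in closed form.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return (H @ self.beta).argmax(axis=1)
```

In this setup, `X` would be the feature vectors produced by the pre-trained 3D-CNN for each EEG segment; training reduces to a single linear solve, which is why ELMs are attractive as lightweight decision layers.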
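The abstract does not specify which filter was applied to the output labels; one common choice for smoothing a per-segment label sequence, shown here purely as an illustrative assumption, is a sliding-window majority (mode) filter that suppresses isolated misclassifications:

```python
import numpy as np
from collections import Counter

def mode_filter(labels, window=5):
    """Replace each predicted label with the majority vote inside a
    sliding window centred on it. `window` is an assumed parameter;
    the paper's actual post-filtering scheme may differ."""
    labels = np.asarray(labels)
    half = window // 2
    out = np.empty_like(labels)
    for i in range(len(labels)):
        lo, hi = max(0, i - half), min(len(labels), i + half + 1)
        # Counter.most_common(1) returns the most frequent label in the window.
        out[i] = Counter(labels[lo:hi].tolist()).most_common(1)[0][0]
    return out
```

Because consecutive EEG segments from the same trial usually share one emotion label, such smoothing can correct brief spurious predictions, which is consistent with the reported accuracy gains after post-filtering.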