Lee Ju-Hwan, Kim Jin-Young, Kim Hyoung-Gook
Department of Intelligent Electronics and Computer Engineering, Chonnam National University, 77 Yongbong-ro, Buk-gu, Gwangju 61186, Republic of Korea.
Department of Electronic Convergence Engineering, Kwangwoon University, 20 Gwangun-ro, Nowon-gu, Seoul 01897, Republic of Korea.
Bioengineering (Basel). 2024 Oct 3;11(10):997. doi: 10.3390/bioengineering11100997.
Multimodal emotion recognition has emerged as a promising approach for capturing the complex nature of human emotions by integrating information from various sources, such as physiological signals, visual behavioral cues, and audio-visual content. However, current methods often struggle to process redundant or conflicting information across modalities effectively and may overlook implicit inter-modal correlations. To address these challenges, this paper presents a novel multimodal emotion recognition framework that integrates audio-visual features with viewers' EEG data to enhance emotion classification accuracy. The proposed approach employs modality-specific encoders to extract spatiotemporal features, which are then aligned through contrastive learning to capture inter-modal relationships. Additionally, cross-modal attention mechanisms are incorporated for effective feature fusion across modalities. The framework, comprising pre-training, fine-tuning, and testing phases, is evaluated on multiple datasets of emotional responses. The experimental results demonstrate that the proposed multimodal approach, which combines audio-visual features with EEG data, is highly effective in recognizing emotions, highlighting its potential for advancing emotion recognition systems.
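The abstract names two mechanisms, contrastive alignment of modality embeddings and cross-modal attention for fusion, without giving their exact form. The following is a minimal NumPy sketch of one common way to realize each, not the authors' implementation: a symmetric InfoNCE-style loss is assumed for the contrastive step, and a single-head scaled dot-product attention without learned projections is assumed for the fusion step. All function names, tensor shapes, and the temperature value are hypothetical and chosen only for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def log_softmax(logits, axis=-1):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(eeg_emb, av_emb, temperature=0.1):
    """Assumed contrastive objective (symmetric InfoNCE).

    eeg_emb, av_emb: arrays of shape (batch, dim); row i of each is a
    matched EEG / audio-visual pair, all other rows act as negatives.
    """
    eeg = l2_normalize(eeg_emb)
    av = l2_normalize(av_emb)
    logits = eeg @ av.T / temperature            # (batch, batch) similarities
    idx = np.arange(logits.shape[0])
    # EEG -> audio-visual direction: softmax over each row.
    loss_e2a = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Audio-visual -> EEG direction: softmax over each column.
    loss_a2e = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_e2a + loss_a2e)


def cross_modal_attention(query_seq, context_seq):
    """Assumed fusion step: one modality attends to the other.

    query_seq: (Tq, d) features of the querying modality (e.g. EEG).
    context_seq: (Tk, d) features of the attended modality (e.g. audio-visual).
    Returns the attended features (Tq, d) and the attention weights (Tq, Tk).
    """
    d = query_seq.shape[-1]
    scores = query_seq @ context_seq.T / np.sqrt(d)
    weights = np.exp(log_softmax(scores, axis=1))  # each row sums to 1
    return weights @ context_seq, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eeg = rng.normal(size=(8, 16))                      # hypothetical EEG embeddings
    av = eeg + 0.01 * rng.normal(size=(8, 16))          # nearly aligned AV embeddings
    print("loss, matched pairs:   ", symmetric_info_nce(eeg, av))
    print("loss, mismatched pairs:", symmetric_info_nce(eeg, av[::-1]))

    fused, weights = cross_modal_attention(rng.normal(size=(5, 16)),
                                           rng.normal(size=(7, 16)))
    print("fused shape:", fused.shape, "weights shape:", weights.shape)
```

In this sketch the loss is lower when row i of the EEG batch is paired with row i of the audio-visual batch than when the pairing is scrambled, which is the property a contrastive pre-training phase relies on. A trained model would typically add learned query, key, and value projections and multiple attention heads, which are omitted here for brevity.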