Department of Computer Science, Fatima Jinnah Women University, Rawalpindi 46000, Pakistan.
Ubiquitous Computing Lab, Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, Yongin-si 17104, Republic of Korea.
Sensors (Basel). 2023 Apr 28;23(9):4373. doi: 10.3390/s23094373.
Multimodal emotion recognition has gained much traction in the fields of affective computing, human-computer interaction (HCI), artificial intelligence (AI), and user experience (UX). There is a growing demand to automate the analysis of user emotion in HCI, AI, and UX evaluation applications in order to provide affective services. Emotion data are increasingly obtained from video, audio, text, or physiological signals. This has led to processing emotions from multiple modalities, usually combined through ensemble-based systems with static weights. Owing to limitations such as missing modality data, inter-class variation, and intra-class similarity, an effective weighting scheme is required to improve the discrimination between modalities. This article accounts for the differing importance of the modalities and assigns them dynamic weights by adopting a more efficient combination process based on generalized mixture (GM) functions. We therefore present a hybrid multimodal emotion recognition (H-MMER) framework that uses a multi-view learning approach for unimodal emotion recognition and introduces multimodal feature-level fusion and decision-level fusion using GM functions. In an experimental study, we evaluated the ability of the proposed framework to model a set of four different emotional states (happiness, sadness, anger, and fear) and found that most of them can be modeled well with significantly high accuracy using GM functions. The experiments show that the proposed framework models emotional states with an average accuracy of 98.19% and yields a significant performance gain over traditional approaches. The overall evaluation results indicate that we can identify emotional states with high accuracy and increase the robustness of an emotion classification system required for UX measurement.
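The abstract does not spell out the fusion arithmetic, so the following minimal sketch illustrates the general idea of decision-level fusion with input-dependent (dynamic) weights in the spirit of GM functions. It is not the authors' exact formulation: the `gm_fuse` helper, the confidence-based weighting rule, and the example scores are all illustrative assumptions.

```python
import numpy as np

def gm_fuse(prob_matrix: np.ndarray) -> np.ndarray:
    """Decision-level fusion with data-dependent weights (illustrative).

    prob_matrix: shape (n_modalities, n_classes); each row is one
    modality's class-probability vector for the same sample.
    Unlike a static ensemble, the weight of each modality is derived
    from the inputs themselves (here, its peak confidence relative to
    the other modalities), so the weights change from sample to sample.
    """
    confidences = prob_matrix.max(axis=1)       # per-modality confidence
    weights = confidences / confidences.sum()   # dynamic, input-dependent weights
    return weights @ prob_matrix                # fused class-probability vector

# Hypothetical scores for (happiness, sadness, anger, fear) on one sample.
probs = np.array([
    [0.70, 0.10, 0.15, 0.05],   # video
    [0.40, 0.30, 0.20, 0.10],   # audio
    [0.55, 0.25, 0.10, 0.10],   # text
])
fused = gm_fuse(probs)
labels = ["happiness", "sadness", "anger", "fear"]
print("fused probabilities:", fused)
print("predicted state:", labels[int(fused.argmax())])
```

In this sketch, a modality that is confidently wrong can still dominate; the appeal of GM-style aggregation in the paper is precisely that the weighting adapts per sample, for example down-weighting a modality whose data are missing or ambiguous.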