Sun Jie, Xu Tianwen, Yao Yao
Psychological Development Guidance Center, School of Educational Sciences, Quanzhou Normal College, Quanzhou, China.
Professional College of Arts and Tourism, Hyogo Public University Corporation, Hyogo, Japan.
Front Psychol. 2025 Apr 9;15:1459446. doi: 10.3389/fpsyg.2024.1459446. eCollection 2024.
Emotion recognition plays a crucial role in understanding decision-making processes, as emotional stimuli significantly influence individuals' choices. However, existing emotion recognition systems face challenges in handling complex natural environments, diverse emotional expressions, and limited data availability, hampering their effectiveness and widespread adoption. To address these issues, we propose an Enhanced GhostNet with Transformer Encoder (EGT) model that leverages deep learning techniques for robust emotion recognition through facial expressions. The EGT model integrates GhostNet's efficient feature extraction, the Transformer's ability to capture global context, and a dual attention mechanism to selectively enhance critical features. Experimental results show that the EGT model achieves an accuracy of 89.3% on the RAF-DB dataset and 85.7% on the AffectNet dataset, outperforming current state-of-the-art lightweight models. These results indicate the model's capability to recognize various emotional states with high confidence, even in challenging and noisy environments. Our model's improved accuracy and robustness in emotion recognition can enhance intelligent human-computer interaction systems, personalized recommendation systems, and mental health monitoring tools. This research underscores the potential of advanced deep learning techniques to significantly improve emotion recognition systems, providing better user experiences and more informed decision-making processes.
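The abstract names GhostNet's efficient feature extraction as one pillar of the EGT model. The core idea of a Ghost module (from the original GhostNet design) is to compute only a few "primary" features with an expensive operation and then derive additional "ghost" features from them with cheap transforms. The sketch below is a toy, dependency-free illustration of that idea only; it is not the authors' implementation, and the function and operation names are hypothetical:

```python
# Toy sketch of the Ghost-module idea: a small set of primary features is
# computed with an expensive op, then cheap linear ops generate extra
# "ghost" features from those primaries. Names here are illustrative only.

def ghost_features(inputs, primary_op, cheap_ops):
    """Return primary features plus cheap 'ghost' features derived from them."""
    primary = [primary_op(x) for x in inputs]               # expensive path
    ghosts = [op(p) for p in primary for op in cheap_ops]   # cheap path
    return primary + ghosts

# Usage with stand-in ops: 'expensive' op squares, 'cheap' ops rescale.
feats = ghost_features(
    [1.0, 2.0],
    primary_op=lambda x: x * x,
    cheap_ops=[lambda p: 0.5 * p, lambda p: 2.0 * p],
)
# feats -> [1.0, 4.0, 0.5, 2.0, 2.0, 8.0]
```

In a real network the "expensive path" would be a standard convolution and the "cheap path" depthwise operations, which is what makes GhostNet attractive as a lightweight backbone for the EGT model.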