Li Shiwei, Wang Jisen, Tian Linbo, Wang Jianqiang, Huang Yan
School of Traffic and Transportation, Lanzhou Jiaotong University, Lanzhou, 730070, China.
Key Laboratory of Railway Industry on Plateau Railway Transportation Intelligent Management and Control, Lanzhou, 730070, China.
Sci Rep. 2025 Feb 20;15(1):6153. doi: 10.1038/s41598-025-90440-2.
Emotion, a fundamental mapping of human responses to external stimuli, has been extensively studied in human-computer interaction, particularly in areas such as intelligent cockpits and systems. However, accurately recognizing emotions from facial expressions remains a significant challenge due to variations in lighting, posture, and micro-expressions. Emotion recognition using global or local facial features is a key research direction; however, relying solely on either often yields models that attend unevenly across facial features, neglecting the key variations critical for detecting emotional changes. This paper proposes a method for modeling and extracting key facial features by integrating global and local facial data. First, we construct a comprehensive image preprocessing model comprising super-resolution, lighting and shading processing, and texture enhancement; this preprocessing step significantly enriches the image's feature representation. Second, a global facial feature recognition model is developed using an encoder-decoder architecture, which effectively eliminates environmental noise and generates a comprehensive global feature dataset for facial analysis. Simultaneously, a Haar cascade classifier extracts refined features from key facial regions, including the eyes, mouth, and overall face, producing a corresponding local feature dataset. Finally, a two-branch convolutional neural network is designed to integrate the global and local feature datasets: the global branch fully characterizes the face as a whole, the local branch focuses on regional detail, and an adaptive fusion module combines the two, sharpening the model's ability to differentiate subtle emotional changes. To evaluate accuracy and robustness, we train and test the model on the FER-2013 and JAFFE emotion datasets, achieving average accuracies of 80.59% and 97.61%, respectively. Compared with existing state-of-the-art models, the refined face feature extraction and fusion model demonstrates superior emotion-recognition performance. Comparative analysis further shows that emotional features exhibit similarities across different faces. Building on psychological research, we therefore regroup the dataset into three emotion classes: positive, neutral, and negative, under which recognition accuracy improves significantly. A self-built dataset further validates that this classification approach has important implications for practical applications.
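To make the pipeline concrete, the Python sketch below (OpenCV + PyTorch) illustrates the general shape of the approach: Haar cascades crop local regions, two small convolutional branches process the global frame and the local crop, and a sigmoid-gated module fuses the two feature vectors. The layer sizes, the gating formulation, and the helper names (`crop_local_regions`, `SmallBranch`, `TwoBranchFER`) are illustrative assumptions, not the paper's implementation; only the Haar cascade files are standard OpenCV assets.

```python
# Minimal sketch of the two-branch idea, under assumed layer sizes and gating.
import cv2
import torch
import torch.nn as nn

# OpenCV ships these cascade XML files; paths resolve via cv2.data.haarcascades.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def crop_local_regions(gray_img):
    """Return face and eye crops found by the Haar cascades (may be empty)."""
    crops = []
    faces = face_cascade.detectMultiScale(gray_img, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        face = gray_img[y:y + h, x:x + w]
        crops.append(face)
        for (ex, ey, ew, eh) in eye_cascade.detectMultiScale(face):
            crops.append(face[ey:ey + eh, ex:ex + ew])
    return crops

class SmallBranch(nn.Module):
    """Tiny conv stack standing in for one branch of the network."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),   # fixed 4x4 map regardless of input size
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, out_dim),
        )

    def forward(self, x):
        return self.net(x)

class TwoBranchFER(nn.Module):
    """Global + local branches joined by a sigmoid-gated adaptive fusion."""
    def __init__(self, num_classes=7, dim=128):
        super().__init__()
        self.global_branch = SmallBranch(dim)
        self.local_branch = SmallBranch(dim)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, global_img, local_img):
        g = self.global_branch(global_img)
        l = self.local_branch(local_img)
        a = self.gate(torch.cat([g, l], dim=1))  # per-feature weights in (0, 1)
        fused = a * g + (1 - a) * l              # adaptive convex combination
        return self.classifier(fused)

# Smoke test with 48x48 grayscale tensors (FER-2013's native resolution).
model = TwoBranchFER()
logits = model(torch.randn(2, 1, 48, 48), torch.randn(2, 1, 48, 48))
print(logits.shape)  # torch.Size([2, 7])
```

The gating design here is one common way to realize an "adaptive fusion module": rather than a fixed concatenation, the network learns per-feature weights that decide, for each input, how much of the global versus local representation to pass to the classifier.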