Multi-loss, feature fusion and improved top-two-voting ensemble for facial expression recognition in the wild.

Author information

Zhou Guangyao, Xie Yuanlun, Fu Yiqin, Wang Zhaokun

Affiliations

School of Computing and Artificial Intelligence, Southwest Jiaotong University, China.

School of Information and Software Engineering, University of Electronic Science and Technology of China, China.

Publication information

Neural Netw. 2025 Mar;183:106937. doi: 10.1016/j.neunet.2024.106937. Epub 2024 Nov 26.

DOI:10.1016/j.neunet.2024.106937

PMID:39615451

Abstract

Facial expression recognition (FER) in the wild is a challenging pattern recognition task, made harder by low image quality, and has attracted broad interest in computer vision. Existing FER methods have not reached the accuracy needed for practical applications, especially in scenarios with low fault tolerance, which limits the adaptability of FER. To explore whether the accuracy of FER in the wild can be improved further, this paper proposes a novel single model named R18+FAML and an ensemble model named R18+FAML-FGA-T2V, which applies feature fusion within a single network, feature fusion among multiple networks, and an ensemble decision strategy. Built on a ResNet18 (R18) backbone, R18+FAML combines internal feature fusion with three attention blocks and uses multiple loss functions (FAML) to improve the diversity of feature extraction. To effectively integrate the feature extractors of multiple networks, we propose feature fusion among networks based on the genetic algorithm (FGA). To make use of more classification information, we propose an ensemble strategy, the improved top-two-voting (T2V) of multiple networks with the same structure. Combining these strategies, R18+FAML-FGA-T2V can focus on the main expression-aware areas by integrating the areas of interest of multiple networks. In experiments on three challenging in-the-wild FER datasets, RAF-DB, AffectNet-8 and AffectNet-7, the single model R18+FAML achieves accuracies of 90.32%, 62.17% and 65.83%, and the ensemble model R18+FAML-FGA-T2V achieves 91.59%, 63.27% and 66.63%, respectively; both achieve state-of-the-art results.
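The abstract does not spell out how the improved top-two-voting works, so the following is only a minimal sketch of the general idea of letting each network's runner-up class contribute to the ensemble decision. The function name `top_two_vote`, the `second_weight` parameter, and the tie-break by summed probability are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def top_two_vote(probs_per_model: Sequence[Sequence[float]],
                 second_weight: float = 0.5) -> int:
    """Return the winning class index from a weighted top-two vote.

    probs_per_model[m][c] is model m's predicted probability for class c.
    Each model gives 1 vote to its top class and `second_weight` votes to
    its runner-up (hypothetical weighting, not the paper's rule).
    Ties in votes are broken by the summed probability across models.
    """
    if not probs_per_model:
        raise ValueError("need at least one model's predictions")
    n_classes = len(probs_per_model[0])
    votes = [0.0] * n_classes
    prob_sum = [0.0] * n_classes
    for probs in probs_per_model:
        if len(probs) != n_classes:
            raise ValueError("all models must predict the same classes")
        # Class indices sorted from most to least probable for this model.
        ranked = sorted(range(n_classes), key=lambda c: probs[c], reverse=True)
        votes[ranked[0]] += 1.0
        if n_classes > 1:
            votes[ranked[1]] += second_weight
        for c, p in enumerate(probs):
            prob_sum[c] += p
    # Highest vote total wins; summed probability breaks ties.
    return max(range(n_classes), key=lambda c: (votes[c], prob_sum[c]))


# Three hypothetical networks that each pick a different top class.
example = [
    [0.50, 0.40, 0.10],  # top: class 0, runner-up: class 1
    [0.10, 0.50, 0.40],  # top: class 1, runner-up: class 2
    [0.20, 0.35, 0.45],  # top: class 2, runner-up: class 1
]
winner = top_two_vote(example)
```

In this made-up example a plain majority vote is a three-way tie, while counting runner-up votes gives class 1 a total of 2.0 against 1.0 and 1.5 for the others, which illustrates how second-choice information can separate otherwise tied decisions.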
