Chen Yuekun, Liu Shuaishi, Zhao Dongxu, Ji Wenkai
School of Electrical and Electronic Engineering, Changchun University of Technology, Changchun, China.
Front Neurorobot. 2023 Aug 17;17:1250706. doi: 10.3389/fnbot.2023.1250706. eCollection 2023.
Recognizing occluded facial expressions in the wild remains a significant challenge. Most previous approaches rely solely on either global or local features, which leads to the loss of relevant expression features. To address this issue, a feature fusion residual attention network (FFRA-Net) is proposed. FFRA-Net consists of a multi-scale module, a local attention module, and a feature fusion module. The multi-scale module divides the intermediate feature map equally along the channel dimension into several sub-feature maps and applies a convolution operation to each of them to obtain diverse global features. The local attention module divides the intermediate feature map into several sub-feature maps along the spatial dimension, applies a convolution operation to each of them, and extracts local key features through an attention mechanism. The feature fusion module integrates the global and local expression features and establishes residual links between inputs and outputs to compensate for the loss of fine-grained features. Finally, two occluded-expression datasets (FM_RAF-DB and SG_RAF-DB) were constructed from the RAF-DB dataset. Extensive experiments show that FFRA-Net achieves accuracies of 77.87%, 79.50%, 88.66%, and 88.97% on FM_RAF-DB, SG_RAF-DB, RAF-DB, and FERPLUS, respectively. These results indicate that the proposed approach is well suited to facial expression recognition (FER) under occlusion.
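The data flow described in the abstract (equal channel-wise partitioning, spatial partitioning with attention, and residual fusion) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nested-list C x H x W layout, the softmax over mean patch activation (a stand-in for learned attention), and the identity placeholder used in place of every convolution are all assumptions made for illustration.

```python
import math

def split_channels(fmap, groups):
    """Split a C x H x W feature map (nested lists) into equal channel groups."""
    channels = len(fmap)
    if channels % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    size = channels // groups
    return [fmap[i * size:(i + 1) * size] for i in range(groups)]

def split_spatial(fmap, rows, cols):
    """Split a C x H x W map into rows * cols non-overlapping spatial patches."""
    height, width = len(fmap[0]), len(fmap[0][0])
    if height % rows != 0 or width % cols != 0:
        raise ValueError("spatial size must be divisible by the grid")
    ph, pw = height // rows, width // cols
    patches = []
    for r in range(rows):
        for q in range(cols):
            patches.append([[row[q * pw:(q + 1) * pw]
                             for row in ch[r * ph:(r + 1) * ph]]
                            for ch in fmap])
    return patches

def patch_attention(patches):
    """Softmax over each patch's mean activation (assumed stand-in for
    the learned attention mechanism)."""
    means = []
    for patch in patches:
        values = [v for ch in patch for row in ch for v in row]
        means.append(sum(values) / len(values))
    peak = max(means)
    exps = [math.exp(m - peak) for m in means]
    total = sum(exps)
    return [e / total for e in exps]

def apply_patch_weights(fmap, weights, rows, cols):
    """Scale every position by the attention weight of the patch it lies in."""
    ph = len(fmap[0]) // rows
    pw = len(fmap[0][0]) // cols
    return [[[v * weights[(i // ph) * cols + (j // pw)]
              for j, v in enumerate(row)]
             for i, row in enumerate(ch)]
            for ch in fmap]

def fuse_sketch(fmap, groups, rows, cols):
    """Residual fusion: input + global branch + local branch."""
    # Global branch: a real model would convolve each channel group;
    # here the groups are concatenated back unchanged (identity placeholder).
    global_feat = [ch for g in split_channels(fmap, groups) for ch in g]
    # Local branch: spatial patches re-weighted by attention.
    weights = patch_attention(split_spatial(fmap, rows, cols))
    local_feat = apply_patch_weights(fmap, weights, rows, cols)
    return [[[x + g + l for x, g, l in zip(xr, gr, lr)]
             for xr, gr, lr in zip(xc, gc, lc)]
            for xc, gc, lc in zip(fmap, global_feat, local_feat)]

# Toy 4 x 4 x 4 feature map in which channel c holds the constant value c.
toy = [[[float(c)] * 4 for _ in range(4)] for c in range(4)]
fused = fuse_sketch(toy, groups=2, rows=2, cols=2)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

The residual term keeps the original input in the output, which corresponds to the abstract's stated goal of compensating for lost fine-grained features. In a real network each branch would contain learned convolutions rather than the placeholders used here.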