Yang Hongling, Xie Lun, Pan Hang, Li Chiqin, Wang Zhiliang, Zhong Jialiang
Department of Computer Science, Changzhi University, Changzhi 046011, China.
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China.
Entropy (Basel). 2023 Aug 22;25(9):1246. doi: 10.3390/e25091246.
Emotional changes in facial micro-expressions are combinations of action units. Prior research has shown that action units can serve as auxiliary data to improve facial micro-expression recognition. Most existing works attempt to fuse image features with action unit information, but they ignore the influence of action units on the facial image feature extraction process itself. Therefore, this paper proposes a local detail feature enhancement model based on a multimodal dynamic attention fusion network (MADFN) for micro-expression recognition. The method uses a masked autoencoder built on a learnable class token to remove local regions with low emotional expressiveness from micro-expression images. An action unit dynamic fusion module then fuses action unit representations into the image features to improve their latent representation ability. The proposed model is evaluated on the SMIC, CASME II, and SAMM datasets and their 3DB-Combined composite. The experimental results show that the model achieves competitive accuracy of 81.71%, 82.11%, and 77.21% on SMIC, CASME II, and SAMM, respectively, indicating that MADFN helps improve the discriminability of facial emotional features.
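The abstract describes two mechanisms: class-token-guided masking of low-salience image regions and dynamic fusion of action unit (AU) representations. The sketch below is only an illustrative reading of those two ideas, not the authors' released implementation; the module names, tensor shapes, keep ratio, and the choice of cross-attention for fusion are all assumptions.

```python
# Minimal sketch (assumed shapes and modules, not the paper's code):
# (1) score patch tokens by attention to a learnable class token and keep the
#     most emotionally salient ones; (2) fuse AU embeddings into the retained
#     image tokens with a cross-attention block.
import torch
import torch.nn as nn


class ClassTokenPatchMasker(nn.Module):
    """Keeps the top-k patch tokens ranked by attention to a learnable class token."""

    def __init__(self, dim: int, keep_ratio: float = 0.5):
        super().__init__()
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.keep_ratio = keep_ratio

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (B, N, D) patch embeddings of a micro-expression frame
        B, N, D = patches.shape
        q = self.q(self.cls_token.expand(B, -1, -1))             # (B, 1, D)
        k = self.k(patches)                                      # (B, N, D)
        scores = (q @ k.transpose(1, 2)).squeeze(1) / D ** 0.5   # (B, N)
        keep = max(1, int(N * self.keep_ratio))
        idx = scores.topk(keep, dim=1).indices                   # salient patch indices
        return torch.gather(patches, 1, idx.unsqueeze(-1).expand(-1, -1, D))


class AUDynamicFusion(nn.Module):
    """One plausible fusion scheme: cross-attention from image tokens to AU embeddings."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens: torch.Tensor, au_tokens: torch.Tensor) -> torch.Tensor:
        fused, _ = self.attn(query=img_tokens, key=au_tokens, value=au_tokens)
        return self.norm(img_tokens + fused)  # residual fusion


if __name__ == "__main__":
    B, N, D, num_aus = 2, 196, 128, 12
    patches = torch.randn(B, N, D)          # image patch features
    au_tokens = torch.randn(B, num_aus, D)  # action-unit embeddings
    kept = ClassTokenPatchMasker(D, keep_ratio=0.5)(patches)
    out = AUDynamicFusion(D)(kept, au_tokens)
    print(out.shape)  # torch.Size([2, 98, 128])
```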