College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an 710054, China.
Math Biosci Eng. 2024 Mar 1;21(4):5007-5031. doi: 10.3934/mbe.2024221.
In demanding application scenarios such as clinical psychotherapy and criminal interrogation, accurate recognition of micro-expressions is critically important but difficult. One of the main difficulties lies in capturing weak, fleeting facial features well enough to improve recognition performance. To address this issue, this paper proposed a novel architecture based on a multi-scale 3D residual convolutional neural network. The algorithm used a deep 3D-ResNet50 as the backbone model and took the micro-expression optical flow feature map as the network input. To exploit the complex spatial and temporal features inherent in micro-expressions, the network incorporated multi-scale convolutional modules with kernels of varying sizes to integrate both global and local information. Furthermore, an attention-based feature fusion module was introduced to enhance the model's contextual awareness. Finally, to improve the model's final prediction, a discriminative network structure with multiple output channels was constructed. The algorithm's performance was evaluated on the public datasets SMIC, SAMM, and CASME Ⅱ. The experimental results showed that the proposed algorithm achieved recognition accuracies of 74.6%, 84.77%, and 91.35% on these datasets, respectively. Compared with existing mainstream methods for extracting subtle micro-expression features, the algorithm was substantially more efficient and improved both recognition performance and accuracy. Consequently, this paper can serve as a reference for researchers working on high-precision micro-expression recognition.
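The abstract does not specify how the attention-based feature fusion module combines the multi-scale branches. As a rough illustration only, the sketch below assumes one common pattern: each branch is summarized by global average pooling, the summaries are turned into weights with a softmax, and the branches are summed with those weights. The function names `softmax` and `fuse_branches`, the use of flattened feature maps, and the pooling-plus-softmax scoring are all assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_branches(branches):
    """Fuse equally sized, flattened feature maps from several branches.

    Hypothetical stand-in for an attention fusion module: each branch
    would be the output of a convolution with a different kernel size.
    Returns the fused feature map and the per-branch attention weights.
    """
    if not branches:
        raise ValueError("at least one branch is required")
    size = len(branches[0])
    if size == 0 or any(len(b) != size for b in branches):
        raise ValueError("branches must be non-empty and equally sized")

    # Global average pooling: one scalar descriptor per branch.
    descriptors = [sum(b) / size for b in branches]

    # Attention weights across branches (assumed scoring rule).
    weights = softmax(descriptors)

    # Weighted sum of the branches, element by element.
    fused = [
        sum(w * b[i] for w, b in zip(weights, branches))
        for i in range(size)
    ]
    return fused, weights


if __name__ == "__main__":
    # Toy features standing in for small-, medium-, and large-kernel branches.
    small = [0.2, 0.4, 0.1, 0.3]
    medium = [0.5, 0.5, 0.6, 0.4]
    large = [0.9, 0.7, 0.8, 1.0]
    fused, weights = fuse_branches([small, medium, large])
    print("weights:", weights)
    print("fused:", fused)
```

In a real network the descriptors would typically pass through learned layers before the softmax, and the branches would be 5D tensors (batch, channels, frames, height, width) rather than flat lists; the sketch keeps only the weighting logic.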