Micro-Expression-Based Emotion Recognition Using Waterfall Atrous Spatial Pyramid Pooling Networks.

Affiliation

Department of Electrical, Electronic and Systems Engineering, Faculty of Engineering and Built Environment, Universiti Kebangsaan Malaysia, Bangi 43600, Selangor, Malaysia.

Publication information

Sensors (Basel). 2022 Jun 19;22(12):4634. doi: 10.3390/s22124634.

DOI:10.3390/s22124634

PMID:35746417

Full-text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC9227116/

Abstract

Understanding a person's attitude or sentiment from their facial expressions has long been a straightforward task for humans. Numerous methods and techniques have been used to classify and interpret human emotions that are commonly communicated through facial expressions, whether macro- or micro-expressions. However, performing this task with computer-based techniques or algorithms has proven extremely difficult, and manual annotation is time-consuming. Compared to macro-expressions, micro-expressions reveal the genuine emotional cues that a person tries to suppress and hide. This research examines different methods and algorithms for recognizing emotions from micro-expressions and presents the results comparatively. The proposed technique is based on a multi-scale deep learning approach that aims to extract facial cues of various subjects under various conditions. Two popular multi-scale approaches, Spatial Pyramid Pooling (SPP) and Atrous Spatial Pyramid Pooling (ASPP), are explored and then optimized for emotion recognition using micro-expression cues. This paper introduces four new architectures based on multi-layer multi-scale convolutional networks using both direct and waterfall network flows. The experimental results show that the ASPP module with waterfall network flow, which we call WASPP-Net, outperforms the state-of-the-art benchmark techniques with an accuracy of 80.5%. In future work, a high-resolution approach to the multi-scale methods can be explored to further improve recognition performance.
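
To make the distinction between direct and waterfall network flows concrete, the following is a minimal one-dimensional sketch in plain Python, not the architecture from the paper. It assumes that "direct" means every atrous branch reads the same input in parallel, and that "waterfall" means each branch reads the output of the previous branch; the abstract does not state the kernel size, the atrous rates, or how branch outputs are fused, so the 3-tap kernel, the rates `[1, 2, 4]`, and all function names here are illustrative assumptions.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution with zero padding and same-length output."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # taps are spaced `rate` samples apart
            if 0 <= j < n:
                acc += w * signal[j]
        out.append(acc)
    return out


def direct_flow(signal, kernel, rates):
    """Parallel branches (assumed meaning of 'direct'): each reads the original input."""
    return [atrous_conv1d(signal, kernel, r) for r in rates]


def waterfall_flow(signal, kernel, rates):
    """Cascaded branches (assumed meaning of 'waterfall'): each reads the previous output."""
    outputs = []
    current = signal
    for r in rates:
        current = atrous_conv1d(current, kernel, r)
        outputs.append(current)
    return outputs


def span(response):
    """Width of the region, in samples, over which the response is nonzero."""
    idx = [i for i, v in enumerate(response) if v != 0]
    return idx[-1] - idx[0] + 1


# A unit impulse: the nonzero span of each branch response equals that
# branch's receptive field.
impulse = [0.0] * 41
impulse[20] = 1.0
kernel = [1.0, 1.0, 1.0]  # illustrative 3-tap kernel (assumption)
rates = [1, 2, 4]         # illustrative atrous rates (assumption)

direct_widths = [span(b) for b in direct_flow(impulse, kernel, rates)]
waterfall_widths = [span(b) for b in waterfall_flow(impulse, kernel, rates)]
print("direct:   ", direct_widths)     # [3, 5, 9]
print("waterfall:", waterfall_widths)  # [3, 7, 15]
```

Under these assumptions, the cascaded branches reach a wider context with the same number of filters (15 samples versus 9 for the last branch), which is one plausible reason a waterfall flow could help with multi-scale facial cues.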
