
Face-mask-aware Facial Expression Recognition based on Face Parsing and Vision Transformer.

Authors

Yang Bo, Wu Jianming, Ikeda Kazushi, Hattori Gen, Sugano Masaru, Iwasawa Yusuke, Matsuo Yutaka

Affiliations

KDDI Research, Inc., 2-1-15 Ohara, Fujimino-shi, Saitama, 356-8502, Japan.

The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8654, Japan.

Publication

Pattern Recognit Lett. 2022 Dec;164:173-182. doi: 10.1016/j.patrec.2022.11.004. Epub 2022 Nov 9.

Abstract

As wearing face masks has become an embedded practice due to the COVID-19 pandemic, facial expression recognition (FER) that takes face masks into account is now a problem that needs to be solved. In this paper, we propose a face parsing and vision Transformer-based method to improve the accuracy of face-mask-aware FER. First, in order to improve the precision of distinguishing the unobstructed facial region as well as those parts of the face covered by a mask, we re-train a face-mask-aware face parsing model, based on the existing face parsing dataset automatically relabeled with a face mask and pixel label. Second, we propose a vision Transformer with a cross attention mechanism-based FER classifier, capable of taking both occluded and non-occluded facial regions into account and reweighting these two parts automatically to achieve the best facial expression recognition performance. The proposed method outperforms existing state-of-the-art face-mask-aware FER methods, as well as other occlusion-aware FER methods, on two datasets that contain three kinds of emotions (the M-LFW-FER and M-KDDI-FER datasets) and two datasets that contain seven kinds of emotions (the M-FER-2013 and M-CK+ datasets).
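The cross-attention reweighting described in the abstract can be illustrated with a minimal sketch, in which patch features from the non-occluded (eye) region attend to features from the mask-covered region and vice versa. All names, shapes, and the use of plain NumPy here are illustrative assumptions; the paper's actual classifier is a full vision Transformer:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Scaled dot-product cross attention: queries from one facial
    region attend to the features of the other region."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)      # (n_q, n_k) similarities
    weights = softmax(scores, axis=-1)          # rows sum to 1
    return weights @ values, weights

rng = np.random.default_rng(0)
d = 8
eye_tokens = rng.normal(size=(4, d))    # non-occluded (eye) region patches
mask_tokens = rng.normal(size=(3, d))   # mask-covered region patches

# Each region's tokens are contextualized by the other region,
# so occluded and non-occluded parts are weighted jointly.
eye_ctx, w_eye = cross_attention(eye_tokens, mask_tokens, mask_tokens)
mask_ctx, w_mask = cross_attention(mask_tokens, eye_tokens, eye_tokens)

print(eye_ctx.shape, mask_ctx.shape)  # (4, 8) (3, 8)
```

Because the softmax rows form a convex combination over the other region's tokens, the attention weights act as a learned, automatic reweighting between the two facial regions.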


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c98e/9645067/27d3cc082622/gr1_lrg.jpg
