
A Robust Facial Expression Recognition Algorithm Based on Multi-Rate Feature Fusion Scheme.

Affiliations

Department of IT Engineering, Sookmyung Women's University, 100 Chungpa-ro 47 gil, Yongsan-gu, Seoul 04310, Korea.

La Trobe Cybersecurity Research Hub, La Trobe University, Melbourne, VIC 3086, Australia.

Publication Information

Sensors (Basel). 2021 Oct 20;21(21):6954. doi: 10.3390/s21216954.

Abstract

In recent years, as the artificial intelligence (AI) field has developed, recognizing human emotions has become increasingly important. Facial expression recognition (FER) is one way of understanding human emotions through facial expressions. We propose a robust multi-depth network that efficiently classifies facial expressions by feeding it diverse, reinforced features. We design the inputs to the multi-depth network as minimally overlapping frames, so as to provide it with richer spatio-temporal information. To exploit the structure of a multi-depth network, we propose a multirate 3D convolutional neural network (CNN) based on a multirate signal processing scheme. In addition, input images are normalized adaptively according to their intensity, and the output features from all depth networks are reinforced by a self-attention module. The reinforced features are then concatenated and classified by a joint fusion classifier. With the proposed algorithm, the result on the CK+ database showed a comparable accuracy of 96.23%. On the MMI and GEMEP-FERA databases, it outperformed other state-of-the-art models with accuracies of 96.69% and 99.79%, respectively. On the AFEW database, which is known for its very wild (in-the-wild) environment, the proposed algorithm achieved an accuracy of 31.02%.
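The front end of the pipeline described above, sampling sub-clips at several temporal rates and normalizing each image by its own intensity, can be sketched as follows. This is a minimal illustrative sketch: the rates, sub-clip length, and function names are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

def multirate_clips(frames, rates=(1, 2, 4), length=8):
    """Sample one sub-clip per temporal rate from a frame sequence.

    Each depth network receives frames taken at a different stride,
    so the sub-clips overlap as little as the sequence allows.
    `rates` and `length` are illustrative choices.
    """
    clips = []
    for r in rates:
        idx = np.arange(0, r * length, r) % len(frames)  # stride-r frame indices
        clips.append(frames[idx])
    return clips

def adaptive_normalize(image, eps=1e-6):
    """Normalize an image by its own intensity statistics, so inputs
    with different brightness reach the network on a comparable scale."""
    mean, std = image.mean(), image.std()
    return (image - mean) / (std + eps)

# Example: a synthetic 32-frame grayscale clip of 64x64 images.
clip = np.random.default_rng(0).uniform(0, 255, size=(32, 64, 64))
sub_clips = multirate_clips(clip)
normalized = [adaptive_normalize(c) for c in sub_clips]
```

Each normalized sub-clip would then be fed to one branch of the multi-depth 3D CNN, whose outputs are reinforced by self-attention and concatenated for the joint fusion classifier.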


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/32d0/8587878/1b453b7152ef/sensors-21-06954-g001.jpg
