Sariyanidi Evangelos, Gunes Hatice, Cavallaro Andrea
IEEE Trans Image Process. 2017 Apr;26(4):1965-1978. doi: 10.1109/TIP.2017.2662237. Epub 2017 Feb 1.
The extraction of descriptive features from sequences of faces is a fundamental problem in facial expression analysis. Psychologists represent facial expressions as a combination of elementary movements known as action units: each movement is localised, and its intensity is specified with a score that is small when the movement is subtle and large when the movement is pronounced. Inspired by this approach, we propose a novel data-driven feature extraction framework that represents facial expression variations as a linear combination of localised basis functions, whose coefficients are proportional to movement intensity. We show that the linear basis functions required by this framework can be obtained by training a sparse linear model with Gabor phase shifts computed from facial videos. The proposed framework overcomes generalisation issues that existing learnt representations do not address, and achieves, with the same learning parameters, state-of-the-art results in recognising both posed expressions and spontaneous micro-expressions. This performance holds even when the training data differ from the test data in the intensity of facial movements and in frame rate.
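To make the described pipeline concrete, the following is a minimal sketch of its two steps: computing Gabor phase shifts between pairs of face frames, and learning a sparse linear basis over those shifts so that the sparse codes of a new sample act as movement-intensity coefficients. This is not the authors' implementation; the Gabor parameters, frame size, dictionary size, and the use of scikit-learn's DictionaryLearning as a stand-in for the paper's sparse linear model are all illustrative assumptions.

```python
# Sketch: Gabor phase shifts + sparse basis learning (illustrative only).
import numpy as np
from scipy.signal import fftconvolve
from sklearn.decomposition import DictionaryLearning


def gabor_kernel(size=21, wavelength=8.0, theta=0.0, sigma=4.0):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid.
    All parameter values here are assumptions, not taken from the paper."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_theta = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(2j * np.pi * x_theta / wavelength)


def gabor_phase_shift(frame_a, frame_b, kernel):
    """Per-pixel Gabor phase difference between two frames, wrapped to (-pi, pi]."""
    resp_a = fftconvolve(frame_a, kernel, mode="same")
    resp_b = fftconvolve(frame_b, kernel, mode="same")
    # Angle of the conjugate product equals the wrapped phase difference.
    return np.angle(resp_b * np.conj(resp_a))


rng = np.random.default_rng(0)
kernel = gabor_kernel()

# Stand-in data: pairs of consecutive 64x64 frames. In practice these would
# be registered face crops taken from expression videos.
pairs = [(rng.random((64, 64)), rng.random((64, 64))) for _ in range(200)]
X = np.stack([gabor_phase_shift(a, b, kernel).ravel() for a, b in pairs])

# Learn a sparse dictionary over the phase-shift vectors: each atom plays the
# role of a localised basis function, and the sparse code of a sample gives
# the coefficients, which scale with movement intensity.
dico = DictionaryLearning(n_components=32, alpha=1.0,
                          transform_algorithm="lasso_lars",
                          max_iter=20, random_state=0)
codes = dico.fit_transform(X)      # sparse coefficients per training pair
bases = dico.components_           # learnt basis functions (32 x 4096)

# Descriptor for a new frame pair: its sparse code in the learnt basis.
new_shift = gabor_phase_shift(*pairs[0], kernel).ravel()
coeffs = dico.transform(new_shift[None, :])
```

Under these assumptions, a code is non-zero only for atoms whose localised phase-shift pattern appears in the input, mirroring the action-unit intuition in the abstract: which atoms activate indicates where movement occurs, and the coefficient magnitudes indicate how pronounced it is.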