Nazil Perveen, Debaditya Roy, C Krishna Mohan
IEEE Trans Image Process. 2020 Jul 30;PP. doi: 10.1109/TIP.2020.3011846.
Recognition of facial expressions across various actors, contexts, and recording conditions in real-world videos involves identifying local facial movements. Hence, it is important to discover the formation of expressions from local representations captured from different parts of the face. In this paper, we propose a dynamic kernel-based representation for facial expressions that assimilates facial movements captured using local spatio-temporal representations in a large universal Gaussian mixture model (uGMM). These dynamic kernels preserve local similarities while handling global context changes for the same expression by utilizing the statistics of the uGMM. We demonstrate the efficacy of the dynamic kernel representation using three different dynamic kernels, namely explicit-mapping-based, probability-based, and matching-based kernels, on three standard facial expression datasets: MMI, AFEW, and BP4D. Our evaluations show that probability-based kernels are the most discriminative among the dynamic kernels. However, in terms of computational complexity, intermediate matching kernels are more efficient compared with the other two representations.
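To make the matching-based variant concrete, below is a minimal illustrative sketch of an intermediate matching kernel over a universal GMM: a uGMM is fit on local descriptors pooled across videos, and for each Gaussian component the kernel matches the single descriptor from each video that is most strongly assigned to that component. The descriptor dimensionality, component count, and RBF base kernel here are assumptions for illustration, not the paper's exact configuration.

```python
# Illustrative sketch (assumed parameters): intermediate matching kernel (IMK)
# over a universal GMM, as one example of a matching-based dynamic kernel.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Local spatio-temporal descriptors pooled from many videos (assumed 16-dim).
pooled = rng.normal(size=(500, 16))
ugmm = GaussianMixture(n_components=8, covariance_type="diag",
                       random_state=0).fit(pooled)

def imk(X, Y, gmm, gamma=0.5):
    """Intermediate matching kernel between two variable-length sets of
    local descriptors X (n, d) and Y (m, d), using GMM components as
    virtual matching features."""
    resp_x = gmm.predict_proba(X)  # (n, Q) component responsibilities
    resp_y = gmm.predict_proba(Y)  # (m, Q)
    k = 0.0
    for q in range(gmm.n_components):
        # Pick, from each set, the descriptor best explained by component q.
        x_star = X[np.argmax(resp_x[:, q])]
        y_star = Y[np.argmax(resp_y[:, q])]
        # RBF base kernel between the matched descriptor pair (assumption).
        k += np.exp(-gamma * np.sum((x_star - y_star) ** 2))
    return k

a = rng.normal(size=(30, 16))  # descriptors from one video
b = rng.normal(size=(45, 16))  # descriptors from another video
print(imk(a, b, ugmm))
```

This also shows why matching kernels are computationally cheap: the kernel only compares one matched descriptor pair per component, so its cost grows with the number of uGMM components rather than with the product of the two set sizes.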