Ding Xiaoyu, Chu Wen-Sheng, De la Torre Fernando, Cohn Jeffrey F, Wang Qiao
School of Information Science and Engineering, Southeast University, Nanjing, China.
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213.
Proc IEEE Int Conf Comput Vis. 2013;2013:2400-2407. doi: 10.1109/ICCV.2013.298.
Automatic facial Action Unit (AU) detection from video is a long-standing problem in facial expression analysis. AU detection is typically posed as a classification problem between frames or segments of positive and negative examples, and existing work has emphasized the choice of features or classifiers. In this paper, we propose a method called Cascade of Tasks (CoT) that combines different tasks (i.e., frame, segment and transition) for AU event detection. We train CoT in a sequential manner that embraces diversity, which ensures robustness and generalization to unseen data. In addition to conventional frame-based metrics that evaluate frames independently, we propose a new event-based metric that evaluates detection performance at the event level. We show that CoT consistently outperforms state-of-the-art approaches on both frame-based and event-based metrics, across three public datasets that differ in complexity: CK+, FERA and RU-FACS.
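The abstract contrasts frame-based metrics, which score each frame independently, with an event-based metric that scores whole events (consecutive runs of positive frames). The sketch below illustrates the general idea of such an event-level F1 score; the grouping of frames into events, the overlap criterion, and the 50% threshold are illustrative assumptions, not the authors' exact formulation.

```python
# Sketch of an event-level F1 metric for AU detection.
# An "event" is a maximal run of consecutive positive frames.
# NOTE: the overlap rule and min_overlap threshold below are assumptions
# for illustration, not the paper's exact definition.

def to_events(labels):
    """Group consecutive positive frames into (start, end) events, end exclusive."""
    events, start = [], None
    for i, v in enumerate(labels):
        if v and start is None:
            start = i
        elif not v and start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(labels)))
    return events

def overlaps(a, b, min_overlap=0.5):
    """True if the intersection covers at least min_overlap of the shorter event."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    shorter = min(a[1] - a[0], b[1] - b[0])
    return shorter > 0 and inter / shorter >= min_overlap

def event_f1(pred, truth, min_overlap=0.5):
    """Event-level F1: precision over predicted events, recall over true events."""
    P, T = to_events(pred), to_events(truth)
    tp_p = sum(any(overlaps(p, t, min_overlap) for t in T) for p in P)
    tp_t = sum(any(overlaps(t, p, min_overlap) for p in P) for t in T)
    precision = tp_p / len(P) if P else 0.0
    recall = tp_t / len(T) if T else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

A frame-based metric would credit the same predictions frame by frame, so a detector that fragments one long true event into many short detections can look good frame-wise while scoring poorly at the event level, which is the distinction the abstract draws.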