Jeni László A, Cohn Jeffrey F, De La Torre Fernando
Carnegie Mellon University, Pittsburgh, PA.
Carnegie Mellon University, Pittsburgh, PA; University of Pittsburgh, Pittsburgh, PA.
Int Conf Affect Comput Intell Interact Workshops. 2013;2013:245-251. doi: 10.1109/ACII.2013.47.
Recognizing facial action units (AUs) is important for situation analysis and automated video annotation. Previous work has emphasized face tracking and registration and the choice of features and classifiers. Relatively neglected is the effect of imbalanced data on action unit detection. While the machine learning community has become aware of the problem of skewed data for training classifiers, little attention has been paid to how skew may bias performance metrics. To address this question, we conducted experiments using both simulated classifiers and three major databases that differ in size, type of FACS coding, and degree of skew. We evaluated the influence of skew on both threshold metrics (Accuracy, F-score, Cohen's kappa, and Krippendorff's alpha) and rank metrics (area under the receiver operating characteristic (ROC) curve and precision-recall curve). With the exception of area under the ROC curve, all were attenuated by skewed distributions, in many cases dramatically so. While ROC was unaffected by skew, precision-recall curves suggest that ROC may mask poor performance. Our findings suggest that skew is a critical factor in evaluating performance metrics. To avoid or minimize skew-biased estimates of performance, we recommend reporting skew-normalized scores along with the obtained ones.
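The effect of skew on threshold metrics can be illustrated with a minimal sketch of a simulated classifier. It is not the authors' code. It assumes that skew is defined as the ratio of negative to positive examples and that a skew-normalized score is obtained by dividing the negative-class counts (false positives and true negatives) by the skew before computing the metric. The function names and the example rates (true positive rate 0.8, false positive rate 0.1) are hypothetical and chosen only for illustration.

```python
def confusion(tpr, fpr, n_pos, skew):
    """Expected confusion counts for a classifier with fixed TPR and FPR.

    Assumption: skew = number of negatives / number of positives.
    """
    n_neg = n_pos * skew
    tp = tpr * n_pos
    fn = n_pos - tp
    fp = fpr * n_neg
    tn = n_neg - fp
    return tp, fp, fn, tn


def f1(tp, fp, fn, tn):
    """F1 score; true negatives do not enter the formula."""
    return 2 * tp / (2 * tp + fp + fn)


def kappa(tp, fp, fn, tn):
    """Cohen's kappa for a binary confusion matrix."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


def skew_normalized(tp, fp, fn, tn):
    """Rescale negative-class counts so both classes have equal weight.

    Assumption: normalization divides FP and TN by the skew.
    """
    skew = (fp + tn) / (tp + fn)
    return tp, fp / skew, fn, tn / skew


if __name__ == "__main__":
    # The same simulated classifier evaluated at increasing skew.
    for skew in (1, 10, 50):
        counts = confusion(tpr=0.8, fpr=0.1, n_pos=100, skew=skew)
        normalized = skew_normalized(*counts)
        print(
            f"skew={skew:>2}  F1={f1(*counts):.3f}  kappa={kappa(*counts):.3f}  "
            f"normalized F1={f1(*normalized):.3f}  "
            f"normalized kappa={kappa(*normalized):.3f}"
        )
```

Because the true and false positive rates are held fixed, the ROC operating point is identical at every skew, while F1 and kappa computed from the raw counts fall as negatives come to dominate. Under the normalization assumed here, the normalized scores equal the values obtained on balanced data.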