
Facial Action Unit Representation Based on Self-Supervised Learning With Ensembled Priori Constraints.

Publication Information

IEEE Trans Image Process. 2024;33:5045-5059. doi: 10.1109/TIP.2024.3446250. Epub 2024 Sep 17.

Abstract

Facial action units (AUs) describe a comprehensive set of atomic facial muscle movements for human expression understanding. With supervised learning, discriminative AU representations can be obtained from the local patches where the AUs are located. Unfortunately, accurate AU localization and characterization are hindered by the need for extensive manual annotation, which limits the performance of AU recognition in realistic scenarios. In this study, we propose an end-to-end self-supervised AU representation learning model (SsupAU) to learn AU representations from unlabeled facial videos. Specifically, the input face is decomposed into six components using auto-encoders: five photo-geometrically meaningful components, together with 2D flow-field AUs. By gradually constructing the canonical neutral face, the posed neutral face, and the posed expressional face, these components can be disentangled without supervision, and the AU representations can therefore be learned. To construct the canonical neutral face without manually labeled ground truth for emotion state or AU intensity, two assumptions based on prior knowledge are proposed: 1) identity consistency, which exploits the identical albedos and depths of different frames in a face video and helps to learn the camera color mode as an extra cue for canonical neutral face recovery; and 2) the average face, which enables the model to discover a 'neutral facial expression' for the canonical neutral face and to decouple the AUs in representation learning. To the best of our knowledge, this is the first attempt to design a self-supervised AU representation learning method based on the definition of AUs. Extensive experiments on benchmark datasets have demonstrated the superior performance of the proposed work in comparison to other state-of-the-art approaches, as well as an outstanding capability of decomposing the input face into meaningful factors for its reconstruction. The code is available at https://github.com/Sunner4nwpu/SsupAU.
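The two prior-knowledge constraints above can be expressed as simple losses. The sketch below is a minimal, hypothetical illustration (not the authors' implementation; all function and argument names are assumptions): identity consistency penalizes differences between the albedo and depth components recovered from two frames of the same video, while the average-face term pulls each recovered canonical neutral face toward the batch mean, discouraging residual expression from leaking into it.

```python
import numpy as np


def identity_consistency_loss(albedo_a, depth_a, albedo_b, depth_b):
    """Identity consistency: two frames of the same video should share
    albedo and depth, so penalize any difference between them (MSE)."""
    return float(
        np.mean((albedo_a - albedo_b) ** 2) + np.mean((depth_a - depth_b) ** 2)
    )


def average_face_loss(canonical_faces):
    """Average-face prior: recovered canonical neutral faces in a batch
    should cluster around their mean, i.e. carry no expression."""
    avg = canonical_faces.mean(axis=0, keepdims=True)
    return float(np.mean((canonical_faces - avg) ** 2))


# Toy usage: identical components give zero loss.
albedo = np.ones((8, 8, 3))
depth = np.ones((8, 8))
print(identity_consistency_loss(albedo, depth, albedo, depth))  # 0.0

faces = np.stack([np.ones((8, 8))] * 4)
print(average_face_loss(faces))  # 0.0
```

In a full training loop these terms would be weighted and added to the photometric reconstruction loss of the auto-encoders; the weights are hyperparameters not specified in the abstract.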

