Zhuhai Da Heng Qin Technology Development Co., Ltd., Zhuhai 519000, China.
Faculty of Innovation Engineering, Macau University of Science and Technology, Macau 999078, China.
Sensors (Basel). 2023 Jan 30;23(3):1532. doi: 10.3390/s23031532.
Face alignment is widely used in high-level face analysis applications, such as human activity recognition and human-computer interaction. However, most existing models have a large number of parameters and are computationally inefficient in practice. In this paper, we build a lightweight facial landmark detector by proposing a network-level architecture-slimming method. Concretely, we introduce a selective feature fusion mechanism to quantify and prune redundant transformation and aggregation operations in a high-resolution supernetwork. Moreover, we develop a triple knowledge distillation scheme to further refine the slimmed network, in which two peer student networks learn the implicit landmark distributions from each other while absorbing knowledge from a teacher network. Extensive experiments on challenging benchmarks, including 300W, COFW, and WFLW, demonstrate that our approach achieves competitive performance with a better trade-off between the number of parameters (0.98 M-1.32 M) and the number of floating-point operations (0.59 G-0.6 G) than recent state-of-the-art methods.
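The triple knowledge distillation scheme can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything below is an assumption made for illustration: landmarks are treated as flattened coordinate vectors, every term is a mean squared error, and the names `mse`, `triple_distillation_losses`, `alpha`, and `beta` are hypothetical. The sketch only shows the structure of the idea, in which each student is pulled toward the ground truth, the teacher, and its peer.

```python
from typing import Sequence, Tuple

Landmarks = Sequence[float]  # flattened (x1, y1, x2, y2, ...) coordinates


def mse(a: Landmarks, b: Landmarks) -> float:
    """Mean squared error between two flattened landmark vectors."""
    if len(a) != len(b):
        raise ValueError("landmark vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def triple_distillation_losses(
    student_a: Landmarks,
    student_b: Landmarks,
    teacher: Landmarks,
    target: Landmarks,
    alpha: float = 0.5,  # assumed weight of the teacher term
    beta: float = 0.5,   # assumed weight of the peer term
) -> Tuple[float, float]:
    """Return one loss per student (hypothetical formulation).

    Each loss sums three terms: the error against the ground truth,
    the distance to the teacher's prediction, and the distance to the
    other student's prediction.
    """

    def loss(own: Landmarks, peer: Landmarks) -> float:
        return (
            mse(own, target)
            + alpha * mse(own, teacher)
            + beta * mse(own, peer)
        )

    return loss(student_a, student_b), loss(student_b, student_a)
```

In a training loop, each student would be updated with its own loss while the teacher's predictions stay fixed. Whether the actual method uses coordinates or heatmaps, and which distance it uses for each term, is not stated in the abstract.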