Institute of Smart Systems and Artificial Intelligence, Nazarbayev University, Astana 010000, Kazakhstan.
Sensors (Basel). 2024 Sep 15;24(18):5993. doi: 10.3390/s24185993.
Accurate face detection and subsequent localization of facial landmarks are mandatory steps in many computer vision applications, such as emotion recognition, age estimation, and gender identification. Thanks to advances in deep learning, numerous applications have been developed for human faces. However, most must employ multiple models to accomplish several tasks simultaneously, which increases memory usage and inference time. In addition, other domains, such as animals and cartoon characters, have received less attention. To address these challenges, we propose an input-agnostic face model, AnyFace++, that performs multiple face-related tasks concurrently: face detection and facial landmark prediction for human, animal, and cartoon faces, plus age estimation, gender classification, and emotion recognition for human faces. We trained the model using deep multi-task, multi-domain learning with a heterogeneous cost function. The experimental results demonstrate that AnyFace++ achieves results comparable to those of state-of-the-art models designed for specific domains.
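The abstract does not define the heterogeneous cost function. One plausible reading, offered only as an illustrative sketch and not as the authors' formulation, is a per-sample loss that mixes different loss types across tasks and skips any task for which a sample has no label (for example, an animal face with landmarks but no age or gender annotation). The task table, weights, and choice of squared error and cross-entropy below are all assumptions made for this example.

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class given predicted probabilities."""
    return -math.log(max(probs[label], 1e-12))

def squared_error(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

# Hypothetical task table: task name -> (loss function, weight).
# The names and unit weights are assumptions, not values from the paper.
TASKS = {
    "landmarks": (squared_error, 1.0),
    "age": (squared_error, 1.0),
    "gender": (cross_entropy, 1.0),
    "emotion": (cross_entropy, 1.0),
}

def multitask_loss(predictions, labels):
    """Sum the weighted task losses, skipping tasks this sample has no label for."""
    total = 0.0
    for task, (loss_fn, weight) in TASKS.items():
        target = labels.get(task)
        if target is None:  # e.g. an animal face carries no age or gender label
            continue
        total += weight * loss_fn(predictions[task], target)
    return total
```

With this design, a single batch can mix human, animal, and cartoon samples: each sample contributes only the terms it is annotated for, so one shared model can be trained across domains without inventing labels for the missing tasks.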