Kong Weiyi, You Zhisheng, Lv Xuebin
National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu, 610065, PR China.
School of Computer Science, Sichuan University, Chengdu, 610064, PR China.
Comput Commun. 2023 Feb 1;199:30-41. doi: 10.1016/j.comcom.2022.12.011. Epub 2022 Dec 13.
As COVID-19 epidemic control becomes routine, fast, high-precision, and unobtrusive face recognition is essential for epidemic prevention and control. This paper proposes a Laplacian pyramid algorithm for deep 3D face recognition that is suitable for use in public places. Multi-modal fusion ensures dense 3D alignment and multi-scale residual fusion. First, a 2D-to-3D structure representation method fully correlates the key-point information and performs dense alignment modeling. Then, based on the 3D key-point model, a five-layer Laplacian depth network is constructed. High-precision recognition is achieved through multi-scale, multi-modal mapping and reconstruction of 3D face depth images. Finally, during training, multi-scale residual weights are embedded in the loss function to improve the network's performance. In addition, to achieve real-time performance, the network is designed as an end-to-end cascade. While maintaining recognition accuracy, it supports personnel screening under routine epidemic control. The approach delivers fast, high-precision face recognition and establishes a 3D face database. The method is adaptable and robust in harsh, low-light, and noisy environments. Moreover, it can complete face reconstruction and recognize faces across various skin colors and poses.
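The abstract refers to a five-layer Laplacian structure and to multi-scale residual weights in the loss function, but gives no implementation details. The following is a minimal sketch of the classical Laplacian pyramid on a depth image, not the authors' network. It assumes 2x2 average pooling in place of the usual Gaussian blur, nearest-neighbour upsampling, and a mean-squared error per layer; the function names and the per-layer `weights` are hypothetical and chosen only for illustration.

```python
import numpy as np


def downsample(img):
    """Halve resolution with 2x2 average pooling (assumed stand-in for blur + decimate)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double resolution with nearest-neighbour repetition."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)


def laplacian_pyramid(depth, levels=5):
    """Return `levels` layers: levels - 1 residual bands plus the coarsest image.

    Height and width must be divisible by 2 ** (levels - 1).
    """
    pyramid = []
    current = depth.astype(np.float64)
    for _ in range(levels - 1):
        smaller = downsample(current)
        # Residual band: detail lost when going to the coarser scale.
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert the decomposition, coarsest layer first."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current


def multiscale_residual_loss(pred, target, weights):
    """Weighted sum of per-layer mean-squared errors (hypothetical loss form)."""
    pred_pyr = laplacian_pyramid(pred, levels=len(weights))
    target_pyr = laplacian_pyramid(target, levels=len(weights))
    return sum(
        w * np.mean((p - t) ** 2)
        for w, p, t in zip(weights, pred_pyr, target_pyr)
    )


# Synthetic 64x64 "depth image" used purely as placeholder data.
rng = np.random.default_rng(0)
depth = rng.random((64, 64))
pyr = laplacian_pyramid(depth, levels=5)
print([layer.shape for layer in pyr])
print(np.allclose(reconstruct(pyr), depth))
```

Because each residual band stores exactly what the coarser layer cannot represent, the decomposition is invertible up to floating-point error. Weighting the layers separately lets a loss emphasize fine detail or coarse shape, which is one plausible reading of the multi-scale residual weight described in the abstract.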