IEEE Trans Neural Netw Learn Syst. 2023 Dec;34(12):10875-10888. doi: 10.1109/TNNLS.2022.3171604. Epub 2023 Nov 30.
Existing occlusion face recognition algorithms tend to focus on the visible facial components. However, these models are limited because they rely heavily on external face segmentation approaches to locate occlusions, making them extremely sensitive to the quality of the learned masks. To tackle this issue, we propose a joint segmentation and identification feature learning framework for end-to-end occlusion face recognition. In particular, instead of employing an external face segmentation model to locate the occlusion, we design an occlusion prediction module, supervised by known mask labels, that is aware of the mask. It shares the underlying convolutional feature maps with the identification network, and the two can be optimized collaboratively. Furthermore, we propose a novel channel refinement network that casts the predicted single-channel occlusion mask into a multi-channel mask matrix, with each channel owning a distinct mask map. Occlusion-free feature maps are then generated by projecting the multi-channel mask probability maps onto the original feature maps. The network can thus suppress the representation of occluded elements in both the spatial and channel dimensions under the guidance of the mask matrix. Moreover, to avoid being misled by aggressively predicted mask maps while actively exploiting usable occlusion-robust features, our proposed feature purification module aggregates the original and occlusion-free feature maps to distill the final candidate embeddings. Lastly, to alleviate the scarcity of real-world occlusion face recognition datasets, we build large-scale synthetic occlusion face datasets, totaling 980,193 face images of 10,574 subjects for the training set and 36,721 face images of 6,817 subjects for the testing set.
Extensive experimental results on the synthetic and real-world occlusion face datasets show that our approach significantly outperforms the state of the art in both 1:1 face verification and 1:N face identification.
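The masking pipeline described above can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the per-channel affine weights stand in for the learned channel refinement network, and the scalar gate `alpha` stands in for the learned feature purification module; all names and shapes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Original convolutional feature maps: (channels, height, width).
C, H, W = 4, 6, 6
features = rng.standard_normal((C, H, W))

# Single-channel occlusion probability map from the occlusion prediction
# module (hypothetical output): 1 = occluded, 0 = visible.
mask = np.zeros((H, W))
mask[:3, :] = 1.0  # assume the upper half of the face is occluded

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Channel refinement (sketch): cast the single-channel mask into a C-channel
# mask matrix via per-channel affine weights, so each channel owns a
# distinct mask map. The weights stand in for a learned 1x1 convolution.
w = rng.standard_normal(C)
b = rng.standard_normal(C)
multi_mask = sigmoid(w[:, None, None] * mask[None] + b[:, None, None])  # (C, H, W)

# Occlusion-free feature maps: project the multi-channel mask probabilities
# onto the original features, suppressing occluded elements in both the
# spatial and channel dimensions.
occlusion_free = features * (1.0 - multi_mask)

# Feature purification (sketch): aggregate original and occlusion-free maps
# with a gate (here a fixed scalar standing in for a learned module), then
# pool into a candidate embedding.
alpha = 0.5
purified = alpha * features + (1.0 - alpha) * occlusion_free
embedding = purified.mean(axis=(1, 2))  # global average pooling -> (C,)
```

Aggregating the original maps alongside the occlusion-free ones is what lets the model recover when the predicted mask is overly aggressive: usable features suppressed by the mask can still reach the embedding through the original branch.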