IEEE Trans Pattern Anal Mach Intell. 2020 Nov;42(11):2926-2943. doi: 10.1109/TPAMI.2019.2916881. Epub 2019 May 14.
Given a tiny face image, existing face hallucination methods aim at super-resolving its high-resolution (HR) counterpart by learning a mapping from an exemplary dataset. Since a low-resolution (LR) input patch may correspond to many HR candidate patches, this ambiguity may lead to distorted HR facial details and wrong attributes such as gender reversal and rejuvenation. An LR input contains low-frequency facial components of its HR version while its residual face image, defined as the difference between the HR ground-truth and interpolated LR images, contains the missing high-frequency facial details. We demonstrate that supplementing residual images or feature maps with additional facial attribute information can significantly reduce the ambiguity in face super-resolution. To explore this idea, we develop an attribute-embedded upsampling network, which consists of an upsampling network and a discriminative network. The upsampling network is composed of an autoencoder with skip-connections, which incorporates facial attribute vectors into the residual features of LR inputs at the bottleneck of the autoencoder, and deconvolutional layers used for upsampling. The discriminative network is designed to examine whether super-resolved faces contain the desired attributes or not and then its loss is used for updating the upsampling network. In this manner, we can super-resolve tiny (16×16 pixels) unaligned face images with a large upscaling factor of 8× while reducing the uncertainty of one-to-many mappings remarkably. By conducting extensive evaluations on a large-scale dataset, we demonstrate that our method achieves superior face hallucination results and outperforms the state-of-the-art.
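The two quantities the abstract defines can be illustrated with a minimal, dependency-free sketch: the residual face image (HR ground truth minus the interpolated LR image) and the injection of an attribute vector into bottleneck features. This is not the authors' implementation. The function names are hypothetical. Nearest-neighbour interpolation is an assumption, because the abstract does not say which interpolation is used. Tiling each attribute value into a constant feature map and appending it as an extra channel is also an assumption, chosen as one common way to merge a vector with spatial features.

```python
def upsample_nearest(image, factor):
    """Enlarge a 2-D image (list of rows) by repeating each pixel `factor` times.

    Assumption: nearest-neighbour stands in for the unspecified interpolation.
    """
    height, width = len(image), len(image[0])
    return [
        [image[r // factor][c // factor] for c in range(width * factor)]
        for r in range(height * factor)
    ]


def residual_image(hr, lr, factor):
    """Residual = HR ground truth minus the interpolated LR image."""
    upsampled = upsample_nearest(lr, factor)
    return [
        [h - u for h, u in zip(hr_row, up_row)]
        for hr_row, up_row in zip(hr, upsampled)
    ]


def embed_attributes(features, attributes):
    """Append one constant feature map per attribute value.

    `features` is a list of channels, each a 2-D list of the same size.
    Assumption: concatenation along the channel axis; the abstract only says
    the attribute vector is incorporated at the autoencoder bottleneck.
    """
    height, width = len(features[0]), len(features[0][0])
    attribute_maps = [
        [[value] * width for _ in range(height)] for value in attributes
    ]
    return features + attribute_maps


# Usage with the sizes named in the abstract: a 16x16 input and an 8x factor.
lr_face = [[0.0] * 16 for _ in range(16)]
hr_face = [[0.0] * 128 for _ in range(128)]
residual = residual_image(hr_face, lr_face, 8)
print(len(residual), len(residual[0]))  # 128 128
```

Adding the residual back to the interpolated LR image recovers the HR image exactly, which is why predicting the residual is equivalent to predicting the missing high-frequency detail.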