Peng Chunlei, Zhang Congyu, Liu Decheng, Wang Nannan, Gao Xinbo
IEEE Trans Image Process. 2023;32:5865-5876. doi: 10.1109/TIP.2023.3326680. Epub 2023 Nov 3.
With the rapid development of generative adversarial networks, face photo-sketch synthesis has achieved promising performance and is playing an increasingly important role in law enforcement as well as entertainment. However, most existing methods work only under interference-free conditions and lack the ability to generalize to in-the-wild scenes. The fidelity of the images they generate is insufficient, and they cannot manipulate results according to a text description. Directly applying existing text-based image manipulation methods to the face photo-sketch scenario may lead to severe distortions because of the cross-domain challenges. Therefore, we propose a novel cross-domain face photo-sketch synthesis framework named HiFiSketch, a network that learns to adjust the weights of generators for high-fidelity synthesis and manipulation. It translates images between the photo domain and the sketch domain and, at the same time, modifies the results according to text input. We further propose a cross-domain loss function that effectively preserves facial details during face photo-sketch synthesis. Extensive experiments on four public face sketch datasets show the superiority of our method over existing methods. We further present text-based face photo-sketch manipulation and sequential face photo-sketch manipulation for the first time, demonstrating the effectiveness of our method for high-fidelity face photo-sketch synthesis and manipulation.