Toward Unbiased High-Quality Portraits through Latent-Space Evaluation.

Author information

Almhaithawi Doaa, Bellini Alessandro, Cerquitelli Tania

Affiliations

Department of Control and Computer Engineering, Politecnico di Torino, 10129 Torino, Italy.

Prime Lab, Mathema s.r.l., 50142 Florence, Italy.

Publication information

J Imaging. 2024 Jun 28;10(7):157. doi: 10.3390/jimaging10070157.

Abstract

Images, texts, voices, and signals can be synthesized by latent spaces in a multidimensional vector, which can be explored without the hurdles of noise or other interfering factors. In this paper, we present a practical use case that demonstrates the power of latent space in exploring complex realities such as image space. We focus on DaVinciFace, an AI-based system that explores the StyleGAN2 space to create a high-quality portrait for anyone in the style of the Renaissance genius Leonardo da Vinci. The user enters one of their portraits and receives the corresponding Da Vinci-style portrait as an output. Since most of Da Vinci's artworks depict young and beautiful women (e.g., "La Belle Ferronnière", "Beatrice de' Benci"), we investigate the ability of DaVinciFace to account for other social categorizations, including gender, race, and age. The experimental results evaluate the effectiveness of our methodology on 1158 portraits acting on the vector representations of the latent space to produce high-quality portraits that retain the facial features of the subject's social categories, and conclude that sparser vectors have a greater effect on these features. To objectively evaluate and quantify our results, we solicited human feedback via a crowd-sourcing campaign. Analysis of the human feedback showed a high tolerance for the loss of important identity features in the resulting portraits when the Da Vinci style is more pronounced, with some exceptions, including Africanized individuals.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af6e/11278512/b6254eb1f351/jimaging-10-00157-g001.jpg
