Controllable Unsupervised Snow Synthesis by Latent Style Space Manipulation.

Author information

Yang Hanting, Carballo Alexander, Zhang Yuxiao, Takeda Kazuya

Affiliations

Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan.

Faculty of Engineering, Graduate School of Engineering, Gifu University, 1-1 Yanagido, Gifu City 501-1193, Japan.

Publication information

Sensors (Basel). 2023 Oct 12;23(20):8398. doi: 10.3390/s23208398.

Abstract

Developing robust perception algorithms for intelligent vehicles depends heavily on images captured under challenging conditions, yet acquiring such images can be both time-consuming and dangerous. Unpaired image-to-image translation models address this issue by synthesizing samples of the desired domain without relying on ground-truth supervision. Current methods, however, predominantly focus on a single projection rather than multiple solutions, and they offer no control over the direction of generation, which leaves room for improvement. In this study, we propose a generative adversarial network (GAN)-based model that incorporates a style encoder and a content encoder, each designed to extract the relevant information from an image. A decoder reconstructs an image from these encoded features, and a self-regression module constrains the style latent space so that the generated output remains within a permissible range. By modifying the hyperparameters, we can generate controllable outputs with specific style codes. We evaluate the model by generating snow scenes on the Cityscapes and EuroCity Persons datasets. The results demonstrate the effectiveness of the proposed method and support its usefulness for intelligent vehicle technology.
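
The content/style split described in the abstract can be illustrated with a deliberately tiny sketch. It is not the authors' model: every function below is a hand-written stand-in for a learned network, the style code is reduced to a single scalar, and the self-regression module is replaced by a plain clamp to an assumed range of [0, 1]. All names (`encode_content`, `encode_style`, `constrain_style`, `decode`, `synthesize`) are hypothetical and chosen only for illustration.

```python
from typing import List

# Toy illustration only: hand-written stand-ins for learned networks.
Image = List[List[float]]  # grayscale image, pixel values in [0, 1]

# Assumed "permissible range" for the scalar style code (not from the paper).
STYLE_MIN, STYLE_MAX = 0.0, 1.0


def encode_content(image: Image) -> Image:
    """Stand-in content encoder: keeps the scene layout (here, a plain copy)."""
    return [row[:] for row in image]


def encode_style(image: Image) -> float:
    """Stand-in style encoder: summarizes appearance as mean brightness."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def constrain_style(style: float) -> float:
    """Clamp the style code to the permissible range.

    This clamp is a simplification that takes the place of the
    self-regression module mentioned in the abstract.
    """
    return max(STYLE_MIN, min(STYLE_MAX, style))


def decode(content: Image, style: float) -> Image:
    """Stand-in decoder: blends every pixel toward white by the style strength."""
    s = constrain_style(style)
    return [[(1.0 - s) * p + s * 1.0 for p in row] for row in content]


def synthesize(image: Image, style: float) -> Image:
    """Re-render the content of `image` under the given style code."""
    return decode(encode_content(image), style)


if __name__ == "__main__":
    clear_scene = [[0.0, 0.5], [1.0, 0.25]]
    # Sweeping the style code gives progressively "snowier" outputs
    # while the scene layout (the content) stays the same.
    for strength in (0.0, 0.5, 1.0):
        print(strength, synthesize(clear_scene, strength))
```

Sweeping `strength` changes only the appearance while the content code is reused unchanged, which is the sense in which a disentangled style code makes generation controllable. In a real model the encoders and decoder would be trained networks and the style code a vector.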

