


Monocular-to-3D Virtual Try-On with Generative Semantic Articulated Fields.

Authors

Xie Zhenyu, Zhao Fuwei, Zheng Jun, Dong Xin, Zhu Feida, Liang Xiaodan

Publication

IEEE Trans Pattern Anal Mach Intell. 2025 Jul 21;PP. doi: 10.1109/TPAMI.2025.3591072.

DOI: 10.1109/TPAMI.2025.3591072
PMID: 40690350
Abstract

We introduce a monocular-to-3D virtual try-on network based on a conditional 3D-aware Generative Adversarial Network (3D-GAN) for synthesizing multi-view try-on results from single monocular images. In contrast to previous 3D virtual try-on methods that rely on costly scanned meshes or pseudo-depth maps for supervision, our approach utilizes a conditional 3D-GAN trained solely on 2D images, greatly simplifying dataset construction and enhancing model scalability. Specifically, we propose a Generative monocular-to-3D Virtual Try-ON network (G3D-VTON) that integrates a 3D-aware conditional Parsing Module (3DPM), a U-Net Refinement Module (URM), and a Flow-based 2D Virtual Try-On Module (FTM). In our framework, the 3DPM is designed to generate a 3D representation of the virtual try-on result, thereby enabling multi-view rendering. To accomplish this, it is implemented using conditional generative semantic articulated fields, which leverage the 3D SMPL prior via inverse skinning to learn the Signed Distance Function (SDF) of the try-on results in a canonical pose space. This learned SDF enables the rendering of both a coarse human parsing map and a preliminary try-on output with explicit camera control. Furthermore, within 3DPM, we introduce deferred pose guidance to decouple style and pose conditions during training, thereby facilitating view-controllable generation during inference. However, the rendered human parsing and try-on results exhibit imprecise shapes and blurry textures. To address these issues, the URM subsequently refines these rendered outputs using a refinement U-Net, and the FTM integrates the refined results with the 2D warped garment to generate the final try-on output with more accurate and realistic appearance details. Extensive experiments demonstrate that the proposed G3D-VTON effectively manipulates and generates faithful 3D human appearances wearing the desired garment, outperforming both 3D-GAN and depth-based 3D approaches while delivering superior visual results in 2D.
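The abstract's core geometric idea is that a query point in posed space is mapped back to a canonical pose via inverse (linear blend) skinning before the SDF is evaluated. Below is a minimal NumPy sketch of that idea only, not the paper's implementation: the function names (`inverse_lbs`, `sphere_sdf`), the two-bone toy rig, and the sphere stand-in for the learned SDF are all illustrative assumptions.

```python
import numpy as np

def inverse_lbs(x_posed, bone_transforms, skin_weights):
    """Map a posed-space point back to canonical space by blending
    the per-bone 4x4 transforms with the skinning weights and
    inverting the result (a common LBS-inversion approximation)."""
    blended = np.tensordot(skin_weights, bone_transforms, axes=(0, 0))  # (4, 4)
    x_h = np.append(x_posed, 1.0)               # homogeneous coordinates
    x_canon = np.linalg.inv(blended) @ x_h
    return x_canon[:3]

def sphere_sdf(x, radius=1.0):
    """Toy canonical-space SDF (a sphere) standing in for the learned field."""
    return np.linalg.norm(x) - radius

# Toy rig: one identity bone and one bone translated 2 units along +x.
T0 = np.eye(4)
T1 = np.eye(4)
T1[0, 3] = 2.0
transforms = np.stack([T0, T1])
weights = np.array([0.5, 0.5])                  # equal skinning weights

# The blended transform translates by 1 along x, so the posed point
# (1, 0, 0) maps back to the canonical origin, where the SDF is -radius.
x_canon = inverse_lbs(np.array([1.0, 0.0, 0.0]), transforms, weights)
print(x_canon, sphere_sdf(x_canon))             # → [0. 0. 0.] -1.0
```

In the actual method, the canonical-space query would feed a learned conditional SDF (plus semantic logits) rather than an analytic sphere, and SMPL provides the bone transforms and skinning weights.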


Similar Articles

1. Monocular-to-3D Virtual Try-On with Generative Semantic Articulated Fields.
IEEE Trans Pattern Anal Mach Intell. 2025 Jul 21;PP. doi: 10.1109/TPAMI.2025.3591072.
2. Structural semantic-guided MR synthesis from PET images via a dual cross-attention mechanism.
Med Phys. 2025 Jul;52(7):e17957. doi: 10.1002/mp.17957.
3. Noise-aware system generative model (NASGM): positron emission tomography (PET) image simulation framework with observer validation studies.
Med Phys. 2025 Jul;52(7):e17962. doi: 10.1002/mp.17962.
4. A medical image classification method based on self-regularized adversarial learning.
Med Phys. 2024 Nov;51(11):8232-8246. doi: 10.1002/mp.17320. Epub 2024 Jul 30.
5. Diffusion semantic segmentation model: A generative model for medical image segmentation based on joint distribution.
Med Phys. 2025 Jul;52(7):e17928. doi: 10.1002/mp.17928. Epub 2025 Jun 8.
6. Sparse-view spectral CT reconstruction via a coupled subspace representation and score-based generative model.
Quant Imaging Med Surg. 2025 Jun 6;15(6):5474-5495. doi: 10.21037/qims-24-2226. Epub 2025 May 28.
7. Simulating dynamic tumor contrast enhancement in breast MRI using conditional generative adversarial networks.
J Med Imaging (Bellingham). 2025 Nov;12(Suppl 2):S22014. doi: 10.1117/1.JMI.12.S2.S22014. Epub 2025 Jun 28.
8. Non-orthogonal kV imaging guided patient position verification in non-coplanar radiation therapy with dataset-free implicit neural representation.
Med Phys. 2025 May 19. doi: 10.1002/mp.17885.
9. Leveraging Physics-Based Synthetic MR Images and Deep Transfer Learning for Artifact Reduction in Echo-Planar Imaging.
AJNR Am J Neuroradiol. 2025 Apr 2;46(4):733-741. doi: 10.3174/ajnr.A8566.
10. Exploring the Potential of Electroencephalography Signal-Based Image Generation Using Diffusion Models: Integrative Framework Combining Mixed Methods and Multimodal Analysis.
JMIR Med Inform. 2025 Jun 25;13:e72027. doi: 10.2196/72027.