View-Aware Geometry-Structure Joint Learning for Single-View 3D Shape Reconstruction.

Authors

Zhang Xuancheng, Ma Rui, Zou Changqing, Zhang Minghao, Zhao Xibin, Gao Yue

Publication information

IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):6546-6561. doi: 10.1109/TPAMI.2021.3090917. Epub 2022 Sep 15.

Abstract

Reconstructing a 3D shape from a single-view image using deep learning has become increasingly popular recently. Most existing methods only focus on reconstructing the 3D shape geometry based on image constraints. The lack of explicit modeling of structure relations among shape parts yields low-quality reconstruction results for structure-rich man-made shapes. In addition, conventional 2D-3D joint embedding architecture for image-based 3D shape reconstruction often omits the specific view information from the given image, which may lead to degraded geometry and structure reconstruction. We address these problems by introducing VGSNet, an encoder-decoder architecture for view-aware joint geometry and structure learning. The key idea is to jointly learn a multimodal feature representation of 2D image, 3D shape geometry and structure so that both geometry and structure details can be reconstructed from a single-view image. To this end, we explicitly represent 3D shape structures as part relations and employ image supervision to guide the geometry and structure reconstruction. Trained with pairs of view-aligned images and 3D shapes, the VGSNet implicitly encodes the view-aware shape information in the latent feature space. Qualitative and quantitative comparisons with the state-of-the-art baseline methods as well as ablation studies demonstrate the effectiveness of the VGSNet for structure-aware single-view 3D shape reconstruction.
