Faculty of Computer and Information Science, University of Ljubljana, 1000 Ljubljana, Slovenia.
Faculty of Electrical Engineering, University of Ljubljana, 1000 Ljubljana, Slovenia.
Sensors (Basel). 2022 Jul 16;22(14):5332. doi: 10.3390/s22145332.
Reconstructing 3D scenes from visual data is a longstanding problem in computer vision. Common reconstruction approaches rely on multiple volumetric primitives to describe complex objects. Superquadrics (a class of volumetric primitives) have shown great promise due to their ability to describe various shapes with only a few parameters. Recent research has shown that deep learning methods can be used to accurately reconstruct random superquadrics from both 3D point cloud data and simple depth images. In this paper, we extended these reconstruction methods to intensity and color images. Specifically, we used a dedicated convolutional neural network (CNN) model to reconstruct a single superquadric from the given input image. We analyzed the results qualitatively and quantitatively, by visualizing the reconstructed superquadrics and by examining the error and accuracy distributions of the predictions. We showed that a CNN model designed around a simple ResNet backbone can accurately reconstruct superquadrics from images containing one object, but only if one of the spatial parameters is fixed or can be determined from other image characteristics, e.g., shadows. Furthermore, we experimented with images of increasing complexity, for example, by adding textures, and observed that the results degraded only slightly. In addition, we showed that our model outperforms the current state-of-the-art method on the studied task. Our final result is a highly accurate superquadric reconstruction model, which can also reconstruct superquadrics from real images of simple objects without additional training.
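To illustrate how a superquadric can describe various shapes with only a few parameters, the following minimal sketch evaluates the standard inside-outside function from the general superquadric literature. The abstract does not state the exact parameterization used in the paper, so the function name and the parameter names (`a1`, `a2`, `a3` for size, `e1`, `e2` for shape) are assumptions chosen for illustration, and the position and rotation of the superquadric are omitted.

```python
def superquadric_inside_outside(x, y, z, a1, a2, a3, e1, e2):
    """Evaluate the standard superquadric inside-outside function F.

    Assumes an axis-aligned superquadric centered at the origin
    (hypothetical helper; pose parameters are left out).
    F < 1: point is inside, F == 1: on the surface, F > 1: outside.
    """
    # abs() keeps the fractional powers real for negative coordinates.
    xy_term = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    z_term = abs(z / a3) ** (2.0 / e1)
    return xy_term + z_term


def is_inside(point, size, shape):
    """Return True if the point lies strictly inside the superquadric."""
    x, y, z = point
    a1, a2, a3 = size
    e1, e2 = shape
    return superquadric_inside_outside(x, y, z, a1, a2, a3, e1, e2) < 1.0


if __name__ == "__main__":
    corner = (0.9, 0.9, 0.9)
    unit = (1.0, 1.0, 1.0)
    # Shape exponents of 1 give an ellipsoid (here a unit sphere).
    print("sphere-like:", is_inside(corner, unit, (1.0, 1.0)))
    # Small shape exponents give a box-like solid with rounded edges.
    print("box-like:   ", is_inside(corner, unit, (0.1, 0.1)))
```

With the same three size parameters, changing only the two shape exponents moves the test point from outside (sphere-like) to inside (box-like), which is the sense in which a handful of parameters covers a range of shapes.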