Li Hai, Yang Xingrui, Zhai Hongjia, Liu Yuqian, Bao Hujun, Zhang Guofeng
IEEE Trans Vis Comput Graph. 2024 Mar;30(3):1743-1755. doi: 10.1109/TVCG.2022.3225844. Epub 2024 Jan 30.
Virtual content creation and interaction play an important role in modern 3D applications. Recovering detailed 3D models from real scenes can significantly expand the scope of such applications and has been studied for decades in the computer vision and computer graphics communities. In this work, we propose Vox-Surf, a voxel-based implicit surface representation. Vox-Surf divides space into a finite set of sparse voxels, where each voxel is a basic geometry unit that stores geometry and appearance information at its corner vertices. Owing to the sparsity inherited from the voxel representation, Vox-Surf is suitable for almost any scene and can be easily trained end-to-end from multi-view images. We use a progressive training process to gradually cull empty voxels and keep only valid voxels for further optimization, which greatly reduces the number of sample points and improves inference speed. Experiments show that Vox-Surf can learn fine surface details and accurate colors with less memory and faster rendering than previous methods. The resulting fine voxels can also serve as bounding volumes for collision detection, which is useful in 3D interactions. We also show the potential application of Vox-Surf in scene editing and augmented reality. The source code is publicly available at https://github.com/zju3dv/Vox-Surf.
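The two ideas the abstract describes, per-voxel information stored at corner vertices and culling of empty voxels, can be illustrated with a minimal sketch. This is not the Vox-Surf implementation. The abstract does not state how the corner information is queried inside a voxel or how a voxel is judged empty, so the sketch assumes trilinear interpolation of corner feature vectors and a caller-supplied keep predicate. All names (`CORNERS`, `interpolate`, `prune`, `vertex_features`) are hypothetical.

```python
from itertools import product

# Offsets of the 8 corners of a unit voxel, in (x, y, z) order.
CORNERS = list(product((0, 1), repeat=3))


def interpolate(vertex_features, voxel, point, voxel_size):
    """Blend the 8 corner feature vectors of `voxel` at `point`.

    Assumption: trilinear interpolation. `voxel` is an integer (i, j, k)
    index, which is also the index of its lowest corner vertex.
    """
    # Local coordinates of the point inside the voxel, each in [0, 1].
    local = [p / voxel_size - i for p, i in zip(point, voxel)]
    out = [0.0] * len(vertex_features[voxel])
    for corner in CORNERS:
        # The weight of a corner is the product of per-axis linear weights.
        weight = 1.0
        for t, c in zip(local, corner):
            weight *= t if c else 1.0 - t
        vertex = tuple(i + c for i, c in zip(voxel, corner))
        for k, value in enumerate(vertex_features[vertex]):
            out[k] += weight * value
    return out


def prune(voxels, keep):
    """Return only the voxels for which the predicate `keep` is true."""
    return {v for v in voxels if keep(v)}


# Toy data: two neighbouring voxels that share four corner vertices.
voxels = {(0, 0, 0), (1, 0, 0)}
vertex_features = {}
for v in voxels:
    for c in CORNERS:
        vertex = tuple(i + o for i, o in zip(v, c))
        # Each feature is just the vertex x coordinate, to keep the demo checkable.
        vertex_features[vertex] = [float(vertex[0])]

feature = interpolate(vertex_features, (0, 0, 0), (0.25, 0.5, 0.5), 1.0)
remaining = prune(voxels, keep=lambda v: v[0] == 0)
```

Keying the features by vertex instead of by voxel means adjacent voxels share their common corners, so the interpolated value stays continuous across voxel faces. In a real pipeline the keep predicate would come from the learned geometry, which this sketch does not model.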