Tan Bin, Xue Nan, Wu Tianfu, Xia Gui-Song
IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):15233-15248. doi: 10.1109/TPAMI.2023.3314745. Epub 2023 Nov 3.
This article studies the challenging two-view 3D reconstruction problem in a rigorous sparse-view configuration, which suffers from insufficient correspondences in the input image pairs for camera pose estimation. We present a novel Neural One-PlanE RANSAC framework (NOPE-SAC for short) that exploits the capability of neural networks to learn one-plane pose hypotheses from 3D plane correspondences. Built on top of a Siamese network for plane detection, NOPE-SAC first generates putative plane correspondences with a coarse initial pose. It then feeds the learned 3D plane correspondences into shared MLPs to estimate one-plane camera pose hypotheses, which are subsequently reweighted in a RANSAC manner to obtain the final camera pose. Because the neural one-plane pose minimizes the number of plane correspondences needed for adaptive pose hypothesis generation, it enables stable pose voting and reliable pose refinement with only a few plane correspondences for sparse-view inputs. In the experiments, we demonstrate that NOPE-SAC significantly improves camera pose estimation for two-view inputs with severe viewpoint changes, setting several new state-of-the-art results on two challenging benchmarks for sparse-view 3D reconstruction, MatterPort3D and ScanNet.
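The voting step described in the abstract, where each plane correspondence contributes one pose hypothesis and the hypotheses are then reweighted against all correspondences, can be illustrated with a minimal sketch. This is not the NOPE-SAC implementation: in the paper both the hypotheses and their scores come from learned MLPs, whereas here a hand-written geometric cost stands in for the learned scoring. The plane parameterization (unit normal n and offset d with n·x = d), the pose convention x2 = R x1 + t, the temperature `tau`, and every function name are assumptions made for illustration only.

```python
import math

def mat_vec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transform_plane(R, t, n, d):
    """Map the plane n.x = d from camera 1 to camera 2 under x2 = R x1 + t."""
    n2 = mat_vec(R, n)
    return n2, d + dot(n2, t)

def hypothesis_cost(R, t, matches):
    """Mean plane disagreement of one pose hypothesis over all matches."""
    total = 0.0
    for n1, d1, n2, d2 in matches:
        n_pred, d_pred = transform_plane(R, t, n1, d1)
        normal_err = math.sqrt(sum((a - b) ** 2 for a, b in zip(n_pred, n2)))
        total += normal_err + abs(d_pred - d2)
    return total / len(matches)

def vote(hypotheses, matches, tau=0.1):
    """Return (best_index, soft_weights); tau is an assumed temperature."""
    costs = [hypothesis_cost(R, t, matches) for R, t in hypotheses]
    c_min = min(costs)  # subtract the minimum for numerical stability
    exps = [math.exp(-(c - c_min) / tau) for c in costs]
    z = sum(exps)
    weights = [e / z for e in exps]
    best = max(range(len(weights)), key=weights.__getitem__)
    return best, weights

def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Synthetic example: three planes seen in both views under a known pose.
R_true, t_true = rot_z(math.radians(40)), [0.5, -0.2, 0.1]
planes = [([1.0, 0.0, 0.0], 2.0), ([0.0, 1.0, 0.0], 1.5), ([0.0, 0.0, 1.0], 3.0)]
matches = []
for n1, d1 in planes:
    n2, d2 = transform_plane(R_true, t_true, n1, d1)
    matches.append((n1, d1, n2, d2))

# One hypothesis per correspondence; the second is the true pose.
hypotheses = [
    (rot_z(math.radians(10)), [0.0, 0.0, 0.0]),
    (R_true, t_true),
    (rot_z(math.radians(70)), [1.0, 0.0, 0.0]),
]
best, weights = vote(hypotheses, matches)
```

In this toy setting the hypothesis that matches the true pose has zero cost and receives almost all of the weight. The soft weights could also drive a weighted fusion of hypotheses rather than a hard selection, which is closer in spirit to the reweighting the abstract describes, though the exact fusion rule used by the authors is not given here.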