Department Augmented Vision, German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
Department of Computer Graphics, Max Planck Institute for Informatics, 66123 Saarbrücken, Germany.
Sensors (Basel). 2019 Oct 22;19(20):4603. doi: 10.3390/s19204603.
Recovery of articulated 3D structure from 2D observations is a challenging computer vision problem with many applications. Current learning-based approaches achieve state-of-the-art accuracy on public benchmarks but are restricted to the specific types of objects and motions covered by the training datasets. Model-based approaches do not rely on training data but show lower accuracy on these datasets. In this paper, we introduce a model-based method called Structure from Articulated Motion (SfAM), which can recover multiple object and motion types without training on extensive data collections. At the same time, it performs on par with learning-based state-of-the-art approaches on public benchmarks and outperforms previous non-rigid structure from motion (NRSfM) methods. SfAM is built upon a general-purpose NRSfM technique and integrates a soft spatio-temporal constraint on the bone lengths. We use an alternating optimization strategy to recover optimal geometry (i.e., bone proportions) together with 3D joint positions by enforcing bone-length consistency over a series of frames. Extensive experiments on public benchmarks and real video sequences show that SfAM is highly robust to noisy 2D annotations, generalizes to arbitrary objects, and does not rely on training data. We believe that it brings a new perspective to the domain of monocular 3D recovery of articulated structures, including human motion capture.
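To make the bone-length consistency idea concrete, the following is a minimal sketch and not the authors' formulation. It assumes 3D joints are stored as a NumPy array of shape (frames, joints, 3) and that the skeleton is given as a list of (parent, child) index pairs. The function names `bone_lengths`, `update_target_lengths`, and `bone_length_penalty` are hypothetical, and the squared-deviation penalty is an assumed stand-in for the soft constraint mentioned in the abstract. Only the bone-length half of an alternating scheme is shown; the NRSfM update of the joint positions is omitted.

```python
import numpy as np

def bone_lengths(joints, bones):
    """Per-frame bone lengths.

    joints: array of shape (F, J, 3) with 3D joint positions per frame.
    bones:  list of (parent, child) joint index pairs (assumed skeleton).
    Returns an array of shape (F, B).
    """
    parents = [p for p, _ in bones]
    children = [c for _, c in bones]
    return np.linalg.norm(joints[:, children] - joints[:, parents], axis=2)

def update_target_lengths(joints, bones):
    """With the joints held fixed, the constant length per bone that
    minimizes the squared deviation over all frames is the mean length."""
    return bone_lengths(joints, bones).mean(axis=0)

def bone_length_penalty(joints, bones, target):
    """Soft penalty (assumed form): sum of squared deviations of the
    per-frame bone lengths from the target lengths."""
    return float(np.sum((bone_lengths(joints, bones) - target) ** 2))
```

In an alternating loop, one would call `update_target_lengths` with the current joint estimate, then re-estimate the joints while adding `bone_length_penalty` (with some weight) to the reconstruction energy, and repeat until the estimates stop changing.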