IEEE Trans Vis Comput Graph. 2022 May;28(5):2201-2211. doi: 10.1109/TVCG.2022.3150485. Epub 2022 Apr 8.
We propose a marker-based geometric framework for high-frequency absolute 3D pose estimation of a binocular camera system using the data captured during the exposure of a single rolling shutter scanline. In contrast to existing approaches that enforce temporal or motion models among scanlines (e.g., linear motion, constant velocity, or small motion assumptions), we determine the pose from instantaneous binocular capture (i.e., without using data from previous scanlines) and achieve drift-free pose estimation. We leverage the projective invariants of a novel rigid planar pattern both to define a geometric reference and to determine 2D-3D correspondences from raw edge detection measurements on individual scanlines. Moreover, to tackle the ensuing multi-view estimation problem, achieve real-time operation, and minimize latency, we develop a pair of custom solvers that exploit our geometric setup. To mitigate sensitivity to noise, we propose a geometrically consistent measurement refinement mechanism. We verify the quality of our solvers by comparing them with state-of-the-art general solvers for absolute pose estimation of generalized cameras. Finally, we demonstrate the effectiveness of our approach with an FPGA-based implementation that achieves a localization throughput of 129.6 kHz with a latency of 1.5 μs.
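As a rough illustration of why projective invariants can identify pattern features from a single scanline, the sketch below checks that the cross-ratio of four collinear points survives a 1D projective map. The abstract does not say which invariant the pattern uses, so the cross-ratio is an assumption here, and the point coordinates, the matrix `H`, and the helper names `cross_ratio` and `project_1d` are hypothetical values chosen for the example, not the authors' method.

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear points given by 1D coordinates."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

def project_1d(x, h):
    """Apply the 1D projective map x -> (h00*x + h01) / (h10*x + h11)."""
    return (h[0, 0] * x + h[0, 1]) / (h[1, 0] * x + h[1, 1])

# Hypothetical positions of four pattern edges along the line cut by a scanline.
pattern_pts = np.array([0.0, 1.0, 3.0, 7.0])

# Hypothetical non-degenerate projective map standing in for the camera projection.
H = np.array([[2.0, 1.0],
              [0.3, 1.5]])

# Simulated edge positions as they would be detected on the scanline.
image_pts = project_1d(pattern_pts, H)

cr_pattern = cross_ratio(*pattern_pts)
cr_image = cross_ratio(*image_pts)

# The two values agree up to floating-point error, so the image-side value
# can be matched against known pattern-side values.
print(cr_pattern, cr_image)
```

Because the value measured in the image equals the value defined on the pattern, a group of detected edges could in principle be matched to pattern features without any data from earlier scanlines.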