Cao Xin, Li Jia, Zhao Panpan, Li Jiachen, Qin Xueying
IEEE Trans Vis Comput Graph. 2024 May;30(5):2173-2183. doi: 10.1109/TVCG.2024.3372111. Epub 2024 Apr 19.
Category-level pose tracking methods can continuously track the pose of objects without requiring any prior knowledge of the specific shape of the tracked instance. This makes them advantageous in augmented reality and virtual reality applications. The key challenge is training a neural network that accurately predicts the poses of objects it has never seen before, i.e., that generalizes strongly. We propose a novel category-level 6D pose tracking method, Corr-Track, which accurately tracks objects belonging to the same category from depth video streams. Our approach uses direct soft correspondence constraints to train a neural network that estimates bidirectional soft correspondences between sparsely sampled point clouds of objects in two frames. We first introduce a soft correspondence matrix for pose tracking tasks and establish effective constraints through direct spatial point-to-point correspondence representations in the sparse point cloud correspondence matrix. We propose a "point cloud expansion" strategy to address the "point cloud shrinkage" problem caused by soft correspondences. This strategy ensures that the corresponding point cloud accurately reproduces the shape of the target point cloud, leading to precise pose tracking results. We evaluated our approach on the NOCS-REAL275 and Wild6D datasets and observed superior performance compared to previous methods. Additionally, we conducted cross-category experiments that further demonstrate its generalization capability.
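As a rough illustration of the soft-correspondence idea described above (not the authors' Corr-Track implementation), the sketch below builds a bidirectional soft correspondence matrix between two sampled point clouds via a softmax over pairwise feature distances, forms the soft-corresponding points, and recovers the frame-to-frame rigid transform with a weighted Kabsch solve. All function names, the temperature parameter, and the use of raw coordinates as stand-in features are assumptions made for the example.

```python
# Minimal sketch, assuming learned per-point features and small inter-frame motion;
# this is NOT the paper's network, only an illustration of soft correspondences.
import numpy as np

def soft_correspondence(feat_a, feat_b, temperature=0.1):
    """Row-stochastic matrix M[i, j] ~ P(point j in B corresponds to point i in A)."""
    d2 = ((feat_a[:, None, :] - feat_b[None, :, :]) ** 2).sum(-1)  # (Na, Nb) squared distances
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)                    # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def weighted_kabsch(src, dst, weights):
    """Rigid (R, t) minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2."""
    w = weights / weights.sum()
    mu_s = (w[:, None] * src).sum(0)
    mu_d = (w[:, None] * dst).sum(0)
    H = (src - mu_s).T @ (w[:, None] * (dst - mu_d))
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])    # avoid reflections
    R = Vt.T @ S @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Toy usage: frame t and frame t+1 related by a small known motion.
rng = np.random.default_rng(0)
pts_a = rng.normal(size=(64, 3))
ang = np.deg2rad(5.0)
R_true = np.array([[np.cos(ang), -np.sin(ang), 0.0],
                   [np.sin(ang),  np.cos(ang), 0.0],
                   [0.0,          0.0,         1.0]])
t_true = np.array([0.02, -0.01, 0.03])
pts_b = pts_a @ R_true.T + t_true

# Raw coordinates stand in for learned per-point features here (valid only for small motion).
M_ab = soft_correspondence(pts_a, pts_b)   # A -> B
M_ba = soft_correspondence(pts_b, pts_a)   # B -> A (the bidirectional counterpart)
corr_in_b = M_ab @ pts_b                   # soft-corresponding points; soft averaging shrinks
                                           # them toward the centroid, the "shrinkage" the
                                           # paper's expansion strategy is said to counteract
R_est, t_est = weighted_kabsch(pts_a, corr_in_b, M_ab.max(axis=1))
```

The weighted Kabsch step is one conventional way to turn soft correspondences into a pose estimate; the paper itself trains the correspondence network with direct spatial point-to-point constraints rather than this toy pipeline.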