Cao Chen, Yu Baocheng, Xu Wenxia, Chen Guojun, Ai Yuming
School of Computer Science and Engineering, Wuhan Institute of Technology, Wuhan 430073, China.
Sensors (Basel). 2024 Sep 3;24(17):5721. doi: 10.3390/s24175721.
Most methods for accurate 6D object pose estimation use a two-stage algorithm. Two-stage algorithms achieve high accuracy, but they are often slow. Many approaches also obtain the 6D pose through an encoder-decoder structure, and many of these decode with bilinear sampling, which tends to lose fine feature detail. We propose a solution that uses an implicit representation as a bridge between discrete and continuous feature maps. We represent the feature map as a coordinate field in which each coordinate pair corresponds to a feature value. These feature values are then used to estimate feature maps at arbitrary scales, replacing upsampling in the decoder. We apply the proposed implicit module to a bidirectional fusion feature pyramid network. On top of this implicit module, we build three network branches: a class estimation branch, a bounding box estimation branch, and a final pose estimation branch. For the pose estimation branch, we propose a miniature dual-stream network that estimates object surface features and complements the relationship between 2D and 3D. We represent the rotation component with the SVD (Singular Value Decomposition) representation, which yields a more accurate object pose. The method achieves satisfactory experimental results on Linemod, a widely used 6D pose estimation benchmark dataset, and provides a more convenient solution for 6D object pose estimation.
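The abstract does not spell out how the SVD rotation representation is computed. A common formulation, assumed here rather than taken from the paper, lets the network output an unconstrained 3x3 matrix and projects it onto the nearest valid rotation matrix via SVD. The sketch below illustrates that projection with NumPy; the function name `svd_to_rotation` is hypothetical.

```python
import numpy as np


def svd_to_rotation(m):
    """Project an arbitrary 3x3 matrix onto the nearest rotation matrix.

    Assumption: this mirrors the usual "SVD orthogonalization" of a raw
    9-value network output; the paper's exact formulation may differ.
    """
    m = np.asarray(m, dtype=float).reshape(3, 3)
    u, _, vt = np.linalg.svd(m)
    # u @ vt is orthogonal, so its determinant is +1 or -1.
    # Flip the last singular direction when it is -1 to avoid a reflection.
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0
    return u @ np.diag([1.0, 1.0, d]) @ vt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.normal(size=(3, 3))  # stand-in for a raw network prediction
    rot = svd_to_rotation(raw)
    print("orthogonal:", np.allclose(rot @ rot.T, np.eye(3)))
    print("determinant:", np.linalg.det(rot))
```

The result is always orthogonal with determinant +1, so any raw prediction maps to a valid rotation, and a matrix that is already a rotation is returned unchanged.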