Research of the three-dimensional tracking and registration method based on multiobjective constraints in an AR system.

Author information

An Zhe, Xu Xiping, Yang Jinhua, Liu Yang, Yan Yuxuan

Publication information

Appl Opt. 2018 Nov 10;57(32):9625-9634. doi: 10.1364/AO.57.009625.

Abstract

Matching a virtual image to the real environment in an augmented reality (AR) system requires three-dimensional (3D) tracking registration. This paper proposes a new method for 3D tracking registration. Previous methods extract feature points from images to perform tracking registration. In this paper, the objects in the scene are instead extracted with a deep convolutional neural network, and the camera pose is estimated by establishing constraint relations among those objects; 3D tracking and registration of the virtual object then follow. We design an improved single-shot multibox detector (SSD) semantic segmentation network that identifies and segments the scene and outputs pixel-level classification results for the objects in it. This network gives better classification results. The depth of each extracted object is estimated from the left and right camera data, and the 2D image is converted into a 3D point cloud. A camera pose estimation method that combines information from multiple objects is proposed, in which the camera transformation matrix is estimated directly from a mathematical model. This avoids the loss of pose-estimation accuracy that occurs when too few feature points are available. Moreover, assigning different weights to the point clouds of different objects reduces errors caused by the model. Experimental results show that the registration error of the proposed 3D registration method is less than 2.5 pixels in an augmented reality head-up display application. The method performs better than existing methods and also improves driving safety.
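
The abstract does not give the mathematical model used for pose estimation, so the following is only an illustrative sketch of the general idea of estimating a rigid camera transformation directly from object point clouds with per-object weights. It substitutes the standard weighted Kabsch (SVD) solution for the paper's model and assumes that point correspondences between the two frames are already known. The function name `weighted_rigid_transform`, the object labels, and the weight values are hypothetical.

```python
import numpy as np


def weighted_rigid_transform(src, dst, weights):
    """Return R (3x3) and t (3,) minimizing sum_i w_i * ||R @ src_i + t - dst_i||^2.

    Weighted Kabsch solution; a stand-in, not the paper's model.
    src, dst: (N, 3) arrays of corresponding 3D points.
    weights:  (N,) non-negative weights, e.g. one value per object.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1

    # Weighted centroids of both point sets.
    src_c = w @ src
    dst_c = w @ dst

    # Weighted 3x3 cross-covariance matrix.
    H = (src - src_c).T @ (w[:, None] * (dst - dst_c))

    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the SVD solution.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


# Synthetic example: three hypothetical objects with 10 points each.
rng = np.random.default_rng(0)
points = rng.normal(size=(30, 3))
angle = np.deg2rad(10.0)
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.2, -0.1, 0.5])
moved = points @ R_true.T + t_true

object_ids = np.repeat([0, 1, 2], 10)        # which object each point belongs to
object_weights = np.array([1.0, 0.5, 0.25])  # hypothetical per-object weights
R_est, t_est = weighted_rigid_transform(points, moved, object_weights[object_ids])
```

With noise-free correspondences, any positive weighting recovers the same transformation. The weights only matter when some objects have less reliable depth or segmentation, in which case down-weighting them reduces their influence on the estimate.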
