Zhou Bing, Guven Sinem
IEEE Trans Vis Comput Graph. 2020 Dec;26(12):3514-3523. doi: 10.1109/TVCG.2020.3023635. Epub 2020 Nov 10.
Augmented Reality is increasingly explored as a new medium for two-way remote collaboration applications, guiding participants more effectively and efficiently via visual instructions. As users strive for more natural interaction and automation in augmented reality applications, new visual recognition techniques are needed to enhance the user experience. Although simple object recognition is often used in augmented reality toward this goal, most collaboration tasks are too complex for such recognition algorithms to suffice. In this paper, we propose a fine-grained visual recognition approach for mobile augmented reality, which leverages RGB video frames, sparse depth feature points identified in real time, and camera pose data to detect the various visual states of an object. We demonstrate the value of our approach through a mobile application designed for hardware support, which automatically detects the state of an object to present the right set of information in the right context.
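The abstract does not describe the model itself, but the input fusion it names (RGB frames, sparse depth feature points, camera pose) can be sketched as a feature-concatenation step feeding a simple state detector. The function names, descriptor choices, and nearest-prototype classifier below are illustrative assumptions, not the authors' actual architecture.

```python
import numpy as np

def fuse_features(rgb_frame, depth_points, camera_pose):
    """Illustrative fusion of the three input modalities named in the
    abstract (a stand-in, not the paper's implementation).

    rgb_frame    : (H, W, 3) uint8 image
    depth_points : (N, 3) sparse 3-D feature points
    camera_pose  : (6,) translation + rotation parameters
    """
    # Cheap global appearance descriptor: per-channel mean and std.
    rgb = rgb_frame.astype(np.float32) / 255.0
    appearance = np.concatenate([rgb.mean(axis=(0, 1)), rgb.std(axis=(0, 1))])

    # Summarize the sparse depth points by their centroid and spread.
    geometry = np.concatenate([depth_points.mean(axis=0),
                               depth_points.std(axis=0)])

    # Camera pose is passed through unchanged.
    return np.concatenate([appearance, geometry, camera_pose])

def classify_state(feature, state_prototypes):
    """Nearest-prototype state detection: return the index of the
    closest stored state descriptor (a placeholder for a learned model)."""
    dists = np.linalg.norm(state_prototypes - feature, axis=1)
    return int(np.argmin(dists))
```

With the descriptors above, the fused vector has 6 + 6 + 6 = 18 dimensions; a learned classifier over such fused features would replace `classify_state` in practice.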