Live Semantic 3D Perception for Immersive Augmented Reality.

Publication information

IEEE Trans Vis Comput Graph. 2020 May;26(5):2012-2022. doi: 10.1109/TVCG.2020.2973477. Epub 2020 Feb 13.

DOI: 10.1109/TVCG.2020.2973477

PMID: 32070983

Abstract

Semantic understanding of 3D environments is critical for both unmanned systems and human-involved virtual/augmented reality (VR/AR) immersive experiences. Spatially-sparse convolution, which takes advantage of the intrinsic sparsity of 3D point cloud data, makes high-resolution 3D convolutional neural networks tractable, with state-of-the-art results on 3D semantic segmentation problems. However, the exhaustive computation limits the practical use of semantic 3D perception for VR/AR applications on portable devices. In this paper, we identify that the efficiency bottleneck lies in the unorganized memory access of the sparse convolution steps: the points are stored independently based on a predefined dictionary, which is inefficient given the limited memory bandwidth of parallel computing devices (GPUs). With the insight that points are continuous as 2D surfaces in 3D space, a chunk-based sparse convolution scheme is proposed to reuse the neighboring points within each spatially organized chunk. An efficient multi-layer adaptive fusion module is further proposed to exploit the spatial consistency cue of 3D data and further reduce the computational burden. Quantitative experiments on public datasets demonstrate that our approach runs 11× faster than previous approaches with competitive accuracy. By implementing both semantic and geometric 3D reconstruction simultaneously on a portable tablet device, we demonstrate a foundation platform for immersive AR applications.
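The contrast the abstract draws between dictionary-based point storage and chunk-based reuse can be illustrated with a toy sketch. This is not the authors' implementation: the function names `hash_sparse_conv` and `chunk_sparse_conv`, the chunk size of 4, the scalar features, and the 3x3x3 kernel are all assumptions made for illustration. The lookup counter is only a rough proxy for scattered memory access and does not model GPU memory behaviour.

```python
from collections import defaultdict
from itertools import product

OFFSETS = list(product((-1, 0, 1), repeat=3))  # 3x3x3 kernel footprint


def hash_sparse_conv(voxels, weights):
    """Baseline: one dictionary lookup per (active voxel, kernel offset) pair."""
    out, lookups = {}, 0
    for (x, y, z) in voxels:
        acc = 0.0
        for (dx, dy, dz) in OFFSETS:
            lookups += 1
            acc += weights[(dx, dy, dz)] * voxels.get((x + dx, y + dy, z + dz), 0.0)
        out[(x, y, z)] = acc
    return out, lookups


def chunk_sparse_conv(voxels, weights, chunk=4):
    """Chunked variant: gather each chunk (plus a one-voxel halo) once, then reuse it."""
    by_chunk = defaultdict(list)
    for (x, y, z) in voxels:
        by_chunk[(x // chunk, y // chunk, z // chunk)].append((x, y, z))

    side = chunk + 2  # chunk edge plus one halo voxel on each side
    out, lookups = {}, 0
    for (cx, cy, cz), members in by_chunk.items():
        ox, oy, oz = cx * chunk - 1, cy * chunk - 1, cz * chunk - 1
        # Gather step: the only place the dictionary is touched for this chunk.
        block = []
        for (i, j, k) in product(range(side), repeat=3):
            lookups += 1
            block.append(voxels.get((ox + i, oy + j, oz + k), 0.0))
        # Compute step: neighbours are read from the contiguous local block.
        for (x, y, z) in members:
            i, j, k = x - ox, y - oy, z - oz
            acc = 0.0
            for (dx, dy, dz) in OFFSETS:
                idx = ((i + dx) * side + (j + dy)) * side + (k + dz)
                acc += weights[(dx, dy, dz)] * block[idx]
            out[(x, y, z)] = acc
    return out, lookups


# Toy scene (hypothetical): an 8x8 planar patch of active voxels, mimicking a surface.
plane = {(x, y, 0): 1.0 for x in range(8) for y in range(8)}
uniform = {o: 1.0 for o in OFFSETS}

ref_out, ref_lookups = hash_sparse_conv(plane, uniform)
chunk_out, chunk_lookups = chunk_sparse_conv(plane, uniform, chunk=4)
print(ref_out == chunk_out)        # True
print(ref_lookups, chunk_lookups)  # 1728 864
```

In this toy setting both variants produce the same output, while the chunked variant performs fewer dictionary lookups because the points on the planar patch share neighbours inside each chunk. The saving depends on how densely the active voxels fill each chunk; for very scattered points the gather step could cost more than it saves.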


Similar articles

Live Semantic 3D Perception for Immersive Augmented Reality.

IEEE Trans Vis Comput Graph. 2020 May;26(5):2012-2022. doi: 10.1109/TVCG.2020.2973477. Epub 2020 Feb 13.

DXR: A Toolkit for Building Immersive Data Visualizations.

IEEE Trans Vis Comput Graph. 2019 Jan;25(1):715-725. doi: 10.1109/TVCG.2018.2865152. Epub 2018 Aug 20.

Real-time deep learning semantic segmentation during intra-operative surgery for 3D augmented reality assistance.

Int J Comput Assist Radiol Surg. 2021 Sep;16(9):1435-1445. doi: 10.1007/s11548-021-02432-y. Epub 2021 Jun 24.

[IMMERSIVE SURGICAL NAVIGATION USING SPATIAL INTERACTIVE VIRTUAL REALITY AND HOLOGRAPHIC AUGMENTED REALITY].

Nihon Geka Gakkai Zasshi. 2016 Sep;117(5):387-94.

SLAM-based dense surface reconstruction in monocular Minimally Invasive Surgery and its application to Augmented Reality.

Comput Methods Programs Biomed. 2018 May;158:135-146. doi: 10.1016/j.cmpb.2018.02.006. Epub 2018 Feb 8.

The Hologram in My Hand: How Effective is Interactive Exploration of 3D Visualizations in Immersive Tangible Augmented Reality?

IEEE Trans Vis Comput Graph. 2018 Jan;24(1):457-467. doi: 10.1109/TVCG.2017.2745941. Epub 2017 Aug 29.

Holistic decomposition convolution for effective semantic segmentation of medical volume images.

Med Image Anal. 2019 Oct;57:149-164. doi: 10.1016/j.media.2019.07.003. Epub 2019 Jul 8.

From Virtual Reality to Immersive Analytics in Bioinformatics.

J Integr Bioinform. 2018 Jul 9;15(2):20180043. doi: 10.1515/jib-2018-0043.

Recovering dense 3D point clouds from single endoscopic image.

Comput Methods Programs Biomed. 2021 Jun;205:106077. doi: 10.1016/j.cmpb.2021.106077. Epub 2021 Apr 3.

Towards real-time photorealistic 3D holography with deep neural networks.

Nature. 2021 Mar;591(7849):234-239. doi: 10.1038/s41586-020-03152-0. Epub 2021 Mar 10.

Cited by

MVS-T: A Coarse-to-Fine Multi-View Stereo Network with Transformer for Low-Resolution Images 3D Reconstruction.

Sensors (Basel). 2022 Oct 9;22(19):7659. doi: 10.3390/s22197659.
