
FaSS-MVS: Fast Multi-View Stereo with Surface-Aware Semi-Global Matching from UAV-Borne Monocular Imagery.

Authors

Ruf Boitumelo, Weinmann Martin, Hinz Stefan

Affiliations

Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany.

Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany.

Publication Information

Sensors (Basel). 2024 Oct 2;24(19):6397. doi: 10.3390/s24196397.

Abstract

With FaSS-MVS, we present a fast, surface-aware semi-global optimization approach for multi-view stereo that allows for rapid depth and normal map estimation from monocular aerial video data captured by unmanned aerial vehicles (UAVs). The data estimated by FaSS-MVS, in turn, facilitate online 3D mapping, meaning that a 3D map of the scene is immediately and incrementally generated as the image data are acquired or being received. FaSS-MVS is composed of a hierarchical processing scheme in which depth and normal data, as well as corresponding confidence scores, are estimated in a coarse-to-fine manner, allowing efficient processing of large scene depths, such as those inherent in oblique images acquired by UAVs flying at low altitudes. The actual depth estimation uses a plane-sweep algorithm for dense multi-image matching to produce depth hypotheses from which the actual depth map is extracted by means of a surface-aware semi-global optimization, reducing the fronto-parallel bias of Semi-Global Matching (SGM). Given the estimated depth map, the pixel-wise surface normal information is then computed by reprojecting the depth map into a point cloud and computing the normal vectors within a confined local neighborhood. In a thorough quantitative and ablative study, we show that the accuracy of the 3D information computed by FaSS-MVS is close to that of state-of-the-art offline multi-view stereo approaches, with the error not even an order of magnitude higher than that of COLMAP. At the same time, however, the average runtime of FaSS-MVS for estimating a single depth and normal map is less than 14% of that of COLMAP, allowing us to perform online and incremental processing of full HD images at 1-2 Hz.
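The final step described above, computing pixel-wise surface normals by reprojecting the depth map into a point cloud and estimating normal vectors within a confined local neighborhood, can be sketched in a few lines. This is a minimal illustration under simplifying assumptions (a pinhole camera with known intrinsics `fx`, `fy`, `cx`, `cy`, and the smallest possible local neighborhood: adjacent pixels via finite differences); the function names are hypothetical and do not reflect the paper's actual implementation.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a metric depth map (H x W) into an H x W x 3 point cloud,
    assuming a pinhole camera with focal lengths fx, fy and principal point cx, cy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def normals_from_points(points):
    """Per-pixel surface normals from the cross product of the two local
    tangent vectors (finite differences over the adjacent-pixel neighborhood)."""
    dp_du = np.gradient(points, axis=1)  # tangent along image columns
    dp_dv = np.gradient(points, axis=0)  # tangent along image rows
    n = np.cross(dp_dv, dp_du)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(norm, 1e-9, None)  # unit-length normals
```

For a fronto-parallel plane (constant depth), this yields normals of approximately (0, 0, -1), i.e. pointing toward the camera. A robust system would additionally smooth or fit planes over a larger window and mask unreliable depths, which this sketch omits.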


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/26ae/11479275/6d04c4d012fc/sensors-24-06397-g0A1.jpg
