
FaSS-MVS: Fast Multi-View Stereo with Surface-Aware Semi-Global Matching from UAV-Borne Monocular Imagery.

Authors

Ruf Boitumelo, Weinmann Martin, Hinz Stefan

Affiliations

Fraunhofer Institute of Optronics, System Technologies and Image Exploitation IOSB, 76131 Karlsruhe, Germany.

Institute of Photogrammetry and Remote Sensing, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany.

Publication Information

Sensors (Basel). 2024 Oct 2;24(19):6397. doi: 10.3390/s24196397.

Abstract

With FaSS-MVS, we present a fast, surface-aware semi-global optimization approach for multi-view stereo that allows for rapid depth and normal map estimation from monocular aerial video data captured by unmanned aerial vehicles (UAVs). The data estimated by FaSS-MVS, in turn, facilitate online 3D mapping, meaning that a 3D map of the scene is immediately and incrementally generated as the image data are acquired or being received. FaSS-MVS is composed of a hierarchical processing scheme in which depth and normal data, as well as corresponding confidence scores, are estimated in a coarse-to-fine manner, allowing efficient processing of large scene depths, such as those inherent in oblique images acquired by UAVs flying at low altitudes. The actual depth estimation uses a plane-sweep algorithm for dense multi-image matching to produce depth hypotheses from which the actual depth map is extracted by means of a surface-aware semi-global optimization, reducing the fronto-parallel bias of Semi-Global Matching (SGM). Given the estimated depth map, the pixel-wise surface normal information is then computed by reprojecting the depth map into a point cloud and computing the normal vectors within a confined local neighborhood. In a thorough quantitative and ablative study, we show that the accuracy of the 3D information computed by FaSS-MVS is close to that of state-of-the-art offline multi-view stereo approaches, with the error not even an order of magnitude higher than that of COLMAP. At the same time, however, the average runtime of FaSS-MVS for estimating a single depth and normal map is less than 14% of that of COLMAP, allowing us to perform online and incremental processing of full HD images at 1-2 Hz.
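The final step described above, computing pixel-wise surface normals by reprojecting the depth map into a point cloud and estimating normal vectors within a confined local neighborhood, can be sketched in a few lines. This is a minimal illustration under simplifying assumptions (a pinhole camera with known intrinsics `fx`, `fy`, `cx`, `cy`, and the smallest possible local neighborhood: adjacent pixels via finite differences); the function names are hypothetical and do not reflect the paper's actual implementation.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a metric depth map (H x W) into an H x W x 3 point cloud,
    assuming a pinhole camera with focal lengths fx, fy and principal point cx, cy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1)

def normals_from_points(points):
    """Per-pixel surface normals from the cross product of the two local
    tangent vectors (finite differences over the adjacent-pixel neighborhood)."""
    dp_du = np.gradient(points, axis=1)  # tangent along image columns
    dp_dv = np.gradient(points, axis=0)  # tangent along image rows
    n = np.cross(dp_dv, dp_du)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(norm, 1e-9, None)  # unit-length normals
```

For a fronto-parallel plane (constant depth), this yields normals of approximately (0, 0, -1), i.e. pointing toward the camera. A robust system would additionally smooth or fit planes over a larger window and mask unreliable depths, which this sketch omits.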


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/26ae/11479275/6d04c4d012fc/sensors-24-06397-g0A1.jpg
