A Light Multi-View Stereo Method with Patch-Uncertainty Awareness.

Authors

Liu Zhen, Wu Guangzheng, Xie Tao, Li Shilong, Wu Chao, Zhang Zhiming, Zhou Jiali

Affiliations

College of Science, Zhejiang University of Technology, Hangzhou 310023, China.

Rept Battero, Wenzhou 325058, China.

Publication information

Sensors (Basel). 2024 Feb 17;24(4):1293. doi: 10.3390/s24041293.

DOI:10.3390/s24041293

PMID:38400452

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10892961/

Abstract

Multi-view stereo methods utilize image sequences from different views to generate a 3D point cloud model of the scene. However, existing approaches often overlook coarse-stage features, impacting the final reconstruction accuracy. Moreover, using a fixed range for all the pixels during inverse depth sampling can adversely affect depth estimation. To address these challenges, we present a novel learning-based multi-view stereo method incorporating attention mechanisms and an adaptive depth sampling strategy. Firstly, we propose a lightweight, coarse-feature-enhanced feature pyramid network in the feature extraction stage, augmented by a coarse-feature-enhanced module. This module integrates features with channel and spatial attention, enriching the contextual features that are crucial for the initial depth estimation. Secondly, we introduce a novel patch-uncertainty-based depth sampling strategy for depth refinement, dynamically configuring depth sampling ranges within the GRU-based optimization process. Furthermore, we incorporate an edge detection operator to extract edge features from the reference image's feature map. These edge features are additionally integrated into the iterative cost volume construction, enhancing the reconstruction accuracy. Lastly, our method is rigorously evaluated on the DTU and Tanks and Temples benchmark datasets, revealing its low GPU memory consumption and competitive reconstruction quality compared to other learning-based MVS methods.
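
The abstract contrasts a fixed inverse-depth sampling range with a range adapted to patch-level uncertainty. The following is a minimal Python sketch of that contrast, not the authors' implementation: the function names, the use of a mean over per-pixel uncertainty values as the patch statistic, and the `scale` parameter are all assumptions made for illustration.

```python
def inverse_depth_samples(d_min, d_max, n):
    """Return n depth hypotheses spaced uniformly in inverse depth.

    The same [d_min, d_max] range is used for every pixel (the fixed-range
    baseline mentioned in the abstract). Requires n >= 2.
    """
    inv_near, inv_far = 1.0 / d_min, 1.0 / d_max
    step = (inv_near - inv_far) / (n - 1)
    return [1.0 / (inv_near - i * step) for i in range(n)]


def adaptive_samples(depth, patch_sigmas, n, d_min, d_max, scale=1.0):
    """Return n hypotheses around `depth`, with a range set by patch uncertainty.

    Assumptions (hypothetical, for illustration only):
      - `patch_sigmas` holds per-pixel uncertainty values, in inverse-depth
        units, for the patch around the pixel;
      - the patch uncertainty is their mean;
      - the sampling half-width is `scale` times that mean;
      - `depth` already lies inside [d_min, d_max].
    """
    sigma = sum(patch_sigmas) / len(patch_sigmas)
    full_range = 1.0 / d_min - 1.0 / d_max
    half_width = min(scale * sigma, full_range / 2.0)

    inv = 1.0 / depth
    # Clamp the window to the global inverse-depth bounds.
    inv_lo = max(inv - half_width, 1.0 / d_max)
    inv_hi = min(inv + half_width, 1.0 / d_min)

    step = (inv_hi - inv_lo) / (n - 1)
    return [1.0 / (inv_hi - i * step) for i in range(n)]


if __name__ == "__main__":
    # Fixed range: identical hypotheses for every pixel.
    print(inverse_depth_samples(1.0, 4.0, 5))
    # Confident patch: hypotheses cluster tightly around the current estimate.
    print(adaptive_samples(2.0, [0.01, 0.01], 5, 1.0, 4.0))
    # Uncertain patch: hypotheses spread over a wider depth interval.
    print(adaptive_samples(2.0, [0.2, 0.2], 5, 1.0, 4.0))
```

In this sketch a confident patch spends all of its hypotheses close to the current estimate, while an uncertain patch searches a wider interval, which is the general motivation the abstract gives for replacing a single fixed range.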

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47a4/10892961/83b4bf25058c/sensors-24-01293-g001.jpg

Similar articles

NR-MVSNet: Learning Multi-View Stereo Based on Normal Consistency and Depth Refinement.

IEEE Trans Image Process. 2023;32:2649-2662. doi: 10.1109/TIP.2023.3272170. Epub 2023 May 12.

OD-MVSNet: Omni-dimensional dynamic multi-view stereo network.

PLoS One. 2024 Aug 15;19(8):e0309029. doi: 10.1371/journal.pone.0309029. eCollection 2024.

Miper-MVS: Multi-scale iterative probability estimation with refinement for efficient multi-view stereo.

Neural Netw. 2023 May;162:502-515. doi: 10.1016/j.neunet.2023.03.012. Epub 2023 Mar 17.

RayMVSNet++: Learning Ray-Based 1D Implicit Fields for Accurate Multi-View Stereo.

IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):13666-13682. doi: 10.1109/TPAMI.2023.3296163. Epub 2023 Oct 3.

MVS-T: A Coarse-to-Fine Multi-View Stereo Network with Transformer for Low-Resolution Images 3D Reconstruction.

Sensors (Basel). 2022 Oct 9;22(19):7659. doi: 10.3390/s22197659.

Visibility-Aware Point-Based Multi-View Stereo Network.

IEEE Trans Pattern Anal Mach Intell. 2021 Oct;43(10):3695-3708. doi: 10.1109/TPAMI.2020.2988729. Epub 2021 Sep 2.

DRI-MVSNet: A depth residual inference network for multi-view stereo images.

PLoS One. 2022 Mar 23;17(3):e0264721. doi: 10.1371/journal.pone.0264721. eCollection 2022.

BSI-MVS: multi-view stereo network with bidirectional semantic information.

Sci Rep. 2024 Mar 21;14(1):6766. doi: 10.1038/s41598-024-55612-6.

Neural Radiance Field-Inspired Depth Map Refinement for Accurate Multi-View Stereo.

J Imaging. 2024 Mar 8;10(3):68. doi: 10.3390/jimaging10030068.
