

Unsupervised Monocular Depth and Camera Pose Estimation with Multiple Masks and Geometric Consistency Constraints

Authors

Zhang Xudong, Zhao Baigan, Yao Jiannan, Wu Guoqing

Affiliations

School of Information Science and Technology, Nantong University, Nantong 226019, China.

School of Mechanical Engineering, Nantong University, Nantong 226019, China.

Publication

Sensors (Basel). 2023 Jun 4;23(11):5329. doi: 10.3390/s23115329.

DOI: 10.3390/s23115329
PMID: 37300056
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10255976/
Abstract

This paper presents a novel unsupervised learning framework for estimating scene depth and camera pose from video sequences, fundamental to many high-level tasks such as 3D reconstruction, visual navigation, and augmented reality. Although existing unsupervised methods have achieved promising results, their performance suffers in challenging scenes such as those with dynamic objects and occluded regions. As a result, multiple mask technologies and geometric consistency constraints are adopted in this research to mitigate their negative impacts. Firstly, multiple mask technologies are used to identify numerous outliers in the scene, which are excluded from the loss computation. In addition, the identified outliers are employed as a supervised signal to train a mask estimation network. The estimated mask is then utilized to preprocess the input to the pose estimation network, mitigating the potential adverse effects of challenging scenes on pose estimation. Furthermore, we propose geometric consistency constraints to reduce sensitivity to illumination changes, which act as additional supervised signals to train the network. Experimental results on the KITTI dataset demonstrate that our proposed strategies can effectively enhance the model's performance, outperforming other unsupervised methods.
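The two ideas the abstract describes — excluding mask-flagged outliers from the loss, and adding a geometric consistency term between depth predictions — can be sketched in NumPy. This is a minimal illustration, not the paper's exact formulation: the function names are invented here, the photometric term is plain L1 (the paper's loss is not specified in the abstract), and the normalized depth-difference form is a common choice in this literature assumed for illustration.

```python
import numpy as np

def masked_photometric_loss(target, warped, mask):
    """L1 photometric loss averaged over pixels the mask marks as valid.

    target, warped: (H, W) float images (warped = source view synthesized
    into the target frame); mask: (H, W) bool, True for inlier pixels.
    Excluding masked-out pixels keeps dynamic objects and occlusions
    from corrupting the training signal.
    """
    diff = np.abs(target - warped)
    valid = mask.astype(float)
    return (diff * valid).sum() / np.maximum(valid.sum(), 1.0)

def depth_consistency_loss(depth, depth_warped, mask):
    """Normalized difference between a predicted depth map and the
    depth of the adjacent frame warped into the same view.

    Because the term compares geometry rather than pixel intensities,
    it is insensitive to illumination changes between frames.
    """
    diff = np.abs(depth - depth_warped) / (depth + depth_warped)
    valid = mask.astype(float)
    return (diff * valid).sum() / np.maximum(valid.sum(), 1.0)
```

In training, both terms would be computed per frame pair and summed with weights; the mask itself would come from the mask estimation network (or from geometric outlier tests early in training).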


Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/12ff182e56c3/sensors-23-05329-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/839923abafc1/sensors-23-05329-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/c50308cc8ae2/sensors-23-05329-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/737052477485/sensors-23-05329-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/d471d5acbda4/sensors-23-05329-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/aa8edc27cc2c/sensors-23-05329-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/545914a9be1d/sensors-23-05329-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/f564e63caa79/sensors-23-05329-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ec7/10255976/93feade44e72/sensors-23-05329-g009.jpg

Similar Articles

1
Masked GAN for Unsupervised Depth and Pose Prediction With Scale Consistency.
IEEE Trans Neural Netw Learn Syst. 2021 Dec;32(12):5392-5403. doi: 10.1109/TNNLS.2020.3044181. Epub 2021 Nov 30.
2
Joint Unsupervised Learning of Depth, Pose, Ground Normal Vector and Ground Segmentation by a Monocular Camera Sensor.
Sensors (Basel). 2020 Jul 3;20(13):3737. doi: 10.3390/s20133737.
3
Unsupervised Estimation of Monocular Depth and VO in Dynamic Environments via Hybrid Masks.
IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):2023-2033. doi: 10.1109/TNNLS.2021.3100895. Epub 2022 May 2.
4
An Unsupervised Monocular Visual Odometry Based on Multi-Scale Modeling.
Sensors (Basel). 2022 Jul 11;22(14):5193. doi: 10.3390/s22145193.
5
Unsupervised monocular visual odometry via combining instance and RGB information.
Appl Opt. 2022 May 1;61(13):3793-3803. doi: 10.1364/AO.452378.
6
Unsupervised Learning of Monocular Depth and Ego-Motion with Optical Flow Features and Multiple Constraints.
Sensors (Basel). 2022 Feb 11;22(4):1383. doi: 10.3390/s22041383.
7
PMIndoor: Pose Rectified Network and Multiple Loss Functions for Self-Supervised Monocular Indoor Depth Estimation.
Sensors (Basel). 2023 Oct 30;23(21):8821. doi: 10.3390/s23218821.
8
Unsupervised Monocular Visual Odometry for Fast-Moving Scenes Based on Optical Flow Network with Feature Point Matching Constraint.
Sensors (Basel). 2022 Dec 9;22(24):9647. doi: 10.3390/s22249647.
9
SelfVIO: Self-supervised deep monocular Visual-Inertial Odometry and depth estimation.
Neural Netw. 2022 Jun;150:119-136. doi: 10.1016/j.neunet.2022.03.005. Epub 2022 Mar 10.

References Cited in This Article

1
sSLAM: Speeded-Up Visual SLAM Mixing Artificial Markers and Temporary Keypoints.
Sensors (Basel). 2023 Feb 16;23(4):2210. doi: 10.3390/s23042210.
2
Autonomous Vehicle Dataset with Real Multi-Driver Scenes and Biometric Data.
Sensors (Basel). 2023 Feb 10;23(4):2009. doi: 10.3390/s23042009.
3
Geometry Sampling-Based Adaption to DCGAN for 3D Face Generation.
Sensors (Basel). 2023 Feb 9;23(4):1937. doi: 10.3390/s23041937.
4
ChatGPT: five priorities for research.
Nature. 2023 Feb;614(7947):224-226. doi: 10.1038/d41586-023-00288-7.
5
Unsupervised Learning of Monocular Depth and Ego-Motion with Optical Flow Features and Multiple Constraints.
Sensors (Basel). 2022 Feb 11;22(4):1383. doi: 10.3390/s22041383.
6
Unsupervised Estimation of Monocular Depth and VO in Dynamic Environments via Hybrid Masks.
IEEE Trans Neural Netw Learn Syst. 2022 May;33(5):2023-2033. doi: 10.1109/TNNLS.2021.3100895. Epub 2022 May 2.
7
Robust RGB-D SLAM Using Point and Line Features for Low Textured Scene.
Sensors (Basel). 2020 Sep 2;20(17):4984. doi: 10.3390/s20174984.
8
Learning Depth from Single Monocular Images Using Deep Convolutional Neural Fields.
IEEE Trans Pattern Anal Mach Intell. 2016 Oct;38(10):2024-39. doi: 10.1109/TPAMI.2015.2505283. Epub 2015 Dec 3.
9
Image quality assessment: from error visibility to structural similarity.
IEEE Trans Image Process. 2004 Apr;13(4):600-12. doi: 10.1109/tip.2003.819861.