Self-supervised recurrent depth estimation with attention mechanisms.

Authors

Makarov Ilya, Bakhanova Maria, Nikolenko Sergey, Gerasimova Olga

Affiliations

HSE University, Moscow, Russia.

Artificial Intelligence Research Institute (AIRI), Moscow, Russia.

Publication information

PeerJ Comput Sci. 2022 Jan 31;8:e865. doi: 10.7717/peerj-cs.865. eCollection 2022.

DOI:10.7717/peerj-cs.865
PMID:35494794
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9044223/
Abstract

Depth estimation has been an essential task for many computer vision applications, especially in autonomous driving, where safety is paramount. Depth can be estimated not only with traditional supervised learning but also via a self-supervised approach that relies on camera motion and does not require ground truth depth maps. Recently, major improvements have been introduced to make self-supervised depth prediction more precise. However, most existing approaches still focus on single-frame depth estimation, even in the self-supervised setting. Since most methods can operate with frame sequences, we believe that the quality of current models can be significantly improved with the help of information about previous frames. In this work, we study different ways of integrating recurrent blocks and attention mechanisms into a common self-supervised depth estimation pipeline. We propose a set of modifications that utilize temporal information from previous frames and provide new neural network architectures for monocular depth estimation in a self-supervised manner. Our experiments on the KITTI dataset show that the proposed modifications can be an effective tool for exploiting temporal information in a depth prediction pipeline.
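
The abstract does not detail the training signal, so the following is only a rough sketch of the general idea behind self-supervision from camera motion, not the authors' pipeline: a predicted depth and a relative camera pose are used to reproject target-frame pixels into a neighbouring source frame, and the intensity mismatch serves as the loss. The names `Intrinsics`, `reproject`, and `photometric_l1`, the toy images, and the camera values are all hypothetical. Practical systems typically add bilinear sampling and a structural-similarity term, both omitted here.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera parameters (the values used below are made up)."""
    fx: float
    fy: float
    cx: float
    cy: float


def reproject(u, v, depth, K, rotation, translation):
    """Map pixel (u, v) of the target frame into the source frame.

    depth is the predicted depth at (u, v); rotation (3x3 nested list) and
    translation (length-3 list) describe the target-to-source camera motion.
    """
    # Back-project the pixel to a 3D point in the target camera frame.
    p = [(u - K.cx) / K.fx * depth, (v - K.cy) / K.fy * depth, depth]
    # Move the point into the source camera frame.
    xs, ys, zs = (
        sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
        for r in range(3)
    )
    # Project back to pixel coordinates.
    return K.fx * xs / zs + K.cx, K.fy * ys / zs + K.cy


def photometric_l1(target, source, depth_map, K, rotation, translation):
    """Mean absolute intensity difference between target pixels and the
    source pixels they land on (nearest neighbour, in-bounds pixels only)."""
    height, width = len(target), len(target[0])
    total, count = 0.0, 0
    for v in range(height):
        for u in range(width):
            us, vs = reproject(u, v, depth_map[v][u], K, rotation, translation)
            ui, vi = round(us), round(vs)
            if 0 <= ui < width and 0 <= vi < height:
                total += abs(target[v][u] - source[vi][ui])
                count += 1
    return total / count if count else float("inf")


# Toy scene: a flat wall at depth 2 seen by a camera that shifts sideways,
# so the source image is the target image moved by one pixel.
K = Intrinsics(fx=4.0, fy=4.0, cx=0.0, cy=0.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shift = [0.5, 0.0, 0.0]
target = [[10, 20, 30, 40, 50, 60]] * 2
source = [[0, 10, 20, 30, 40, 50]] * 2

loss_true_depth = photometric_l1(target, source, [[2.0] * 6] * 2, K, identity, shift)
loss_wrong_depth = photometric_l1(target, source, [[1.0] * 6] * 2, K, identity, shift)
```

In this toy setup the correct depth yields a lower loss than the wrong one, which is what lets such a loss supervise depth without ground truth depth maps.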

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ffe/9044223/b7a708890f04/peerj-cs-08-865-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ffe/9044223/a72e6aff317e/peerj-cs-08-865-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ffe/9044223/cc2494a9a7a5/peerj-cs-08-865-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ffe/9044223/c836e19f0a3b/peerj-cs-08-865-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ffe/9044223/15652c4b421c/peerj-cs-08-865-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ffe/9044223/01c7a3466e69/peerj-cs-08-865-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ffe/9044223/f741bd867f13/peerj-cs-08-865-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ffe/9044223/997906354ec9/peerj-cs-08-865-g008.jpg

Similar articles

1. Self-supervised recurrent depth estimation with attention mechanisms.
   PeerJ Comput Sci. 2022 Jan 31;8:e865. doi: 10.7717/peerj-cs.865. eCollection 2022.
2. Online supervised attention-based recurrent depth estimation from monocular video.
   PeerJ Comput Sci. 2020 Nov 23;6:e317. doi: 10.7717/peerj-cs.317. eCollection 2020.
3. Self-supervised Monocular Depth Estimation with 3D Displacement Module for Laparoscopic Images.
   IEEE Trans Med Robot Bionics. 2022 May;4(2):331-334. doi: 10.1109/TMRB.2022.3170206.
4. Cycle-SfM: Joint self-supervised learning of depth and camera motion from monocular image sequences.
   Chaos. 2019 Dec;29(12):123102. doi: 10.1063/1.5120605.
5. Joint Soft-Hard Attention for Self-Supervised Monocular Depth Estimation.
   Sensors (Basel). 2021 Oct 20;21(21):6956. doi: 10.3390/s21216956.
6. SENSE: Self-Evolving Learning for Self-Supervised Monocular Depth Estimation.
   IEEE Trans Image Process. 2024;33:439-450. doi: 10.1109/TIP.2023.3338053. Epub 2023 Dec 29.
7. Monocular Depth Estimation with Self-Supervised Learning for Vineyard Unmanned Agricultural Vehicle.
   Sensors (Basel). 2022 Jan 18;22(3):721. doi: 10.3390/s22030721.
8. Self-Supervised Object Distance Estimation Using a Monocular Camera.
   Sensors (Basel). 2022 Apr 12;22(8):2936. doi: 10.3390/s22082936.
9. MLDA-Net: Multi-Level Dual Attention-Based Network for Self-Supervised Monocular Depth Estimation.
   IEEE Trans Image Process. 2021;30:4691-4705. doi: 10.1109/TIP.2021.3074306. Epub 2021 May 3.
10. SelfVIO: Self-supervised deep monocular Visual-Inertial Odometry and depth estimation.
    Neural Netw. 2022 Jun;150:119-136. doi: 10.1016/j.neunet.2022.03.005. Epub 2022 Mar 10.

Cited by

1. Sterilization of image steganography using self-supervised convolutional neural network.
   PeerJ Comput Sci. 2024 Sep 24;10:e2330. doi: 10.7717/peerj-cs.2330. eCollection 2024.
2. Polarimetric Imaging for Robot Perception: A Review.
   Sensors (Basel). 2024 Jul 9;24(14):4440. doi: 10.3390/s24144440.
3. Monocular Depth Estimation Using Deep Learning: A Review.
   Sensors (Basel). 2022 Jul 18;22(14):5353. doi: 10.3390/s22145353.

References

1. NeuralRecon: Real-Time Coherent 3D Scene Reconstruction From Monocular Video.
   IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):7542-7555. doi: 10.1109/TPAMI.2024.3393141. Epub 2024 Nov 6.
2. Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction.
   IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):7282-7295. doi: 10.1109/TPAMI.2021.3097396. Epub 2022 Sep 14.
3. Online supervised attention-based recurrent depth estimation from monocular video.
   PeerJ Comput Sci. 2020 Nov 23;6:e317. doi: 10.7717/peerj-cs.317. eCollection 2020.
4. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding.
   IEEE Trans Pattern Anal Mach Intell. 2019 Jul 23. doi: 10.1109/TPAMI.2019.2930258.
5. Deep Ordinal Regression Network for Monocular Depth Estimation.
   Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit. 2018 Jun;2018:2002-2011. doi: 10.1109/CVPR.2018.00214. Epub 2018 Dec 17.
6. RefineNet: Multi-Path Refinement Networks for Dense Prediction.
   IEEE Trans Pattern Anal Mach Intell. 2020 May;42(5):1228-1242. doi: 10.1109/TPAMI.2019.2893630. Epub 2019 Jan 18.
7. Image quality assessment: from error visibility to structural similarity.
   IEEE Trans Image Process. 2004 Apr;13(4):600-12. doi: 10.1109/tip.2003.819861.
8. Learning to forget: continual prediction with LSTM.
   Neural Comput. 2000 Oct;12(10):2451-71. doi: 10.1162/089976600300015015.
9. Long short-term memory.
   Neural Comput. 1997 Nov 15;9(8):1735-80. doi: 10.1162/neco.1997.9.8.1735.