
Fusing Events and Frames with Coordinate Attention Gated Recurrent Unit for Monocular Depth Estimation

Authors

Duan Huimei, Guo Chenggang, Ou Yuan

Affiliation

School of Computer and Software Engineering, Xihua University, Chengdu 610039, China.

Publication

Sensors (Basel). 2024 Dec 4;24(23):7752. doi: 10.3390/s24237752.

DOI: 10.3390/s24237752
PMID: 39686289
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11645081/
Abstract

Monocular depth estimation is a central problem in computer vision and robot vision, aiming to recover the depth of a scene from a single image. In extreme conditions such as highly dynamic scenes or drastic lighting changes, monocular depth estimation methods based on conventional cameras often perform poorly. Event cameras capture brightness changes asynchronously but cannot acquire color or absolute brightness information, which makes exploiting the complementary advantages of event cameras and conventional cameras an appealing choice. However, how to effectively fuse event data and frames to improve the accuracy and robustness of monocular depth estimation remains an open problem. To overcome these challenges, this paper proposes a novel Coordinate Attention Gated Recurrent Unit (CAGRU). Unlike conventional ConvGRUs, the CAGRU abandons the practice of implementing every gate as a convolutional layer: it designs coordinate attention as an attention gate and combines it with a convolutional gate. Coordinate attention explicitly models inter-channel dependencies together with spatial coordinate information. The coordinate attention gate, in conjunction with the convolutional gate, enables the network to model feature information spatially, temporally, and across channels. On this basis, the CAGRU enhances the spatial information density of sparse events as temporal information is propagated recurrently, achieving more effective feature screening and fusion. It effectively integrates feature information from event cameras and standard cameras, further improving the accuracy and robustness of monocular depth estimation. Experimental results show that the proposed method achieves significant performance improvements on several public datasets.
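To make the gating mechanism concrete, below is a minimal PyTorch sketch of a CAGRU-style cell reconstructed from the abstract alone. The coordinate-attention block follows the standard published formulation (factorized height/width pooling); everything else, including the class names `CoordinateAttention` and `CAGRUCell` and the choice of the update gate as the attention-driven gate, is an illustrative assumption rather than the authors' implementation.

```python
# Minimal sketch of a CAGRU-style cell, assuming PyTorch. Names and the
# decision to make the update gate attention-driven are illustrative
# assumptions based on the abstract; this is not the authors' code.
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Coordinate attention: pools features along height and width
    separately, so the resulting gates encode inter-channel dependencies
    together with spatial coordinate information."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        mid = max(channels // reduction, 8)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                  # (N, C, H, 1)
        x_w = x.mean(dim=2, keepdim=True).transpose(2, 3)  # (N, C, W, 1)
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                  # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.transpose(2, 3)))  # (N, C, 1, W)
        return x * a_h * a_w  # reweight by position along both axes

class CAGRUCell(nn.Module):
    """ConvGRU cell in which one gate (here the update gate) is driven by
    coordinate attention while the reset gate and candidate state remain
    plain convolutions, as in a standard ConvGRU."""
    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        p = k // 2
        self.reset = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)
        self.proj = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)
        self.attn = CoordinateAttention(hid_ch)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        xh = torch.cat([x, h], dim=1)
        # Attention gate: screens, per channel and per spatial coordinate,
        # how much of the new (event) features update the recurrent state.
        z = torch.sigmoid(self.attn(self.proj(xh)))
        r = torch.sigmoid(self.reset(xh))
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

# Usage: initialize the state from frame features, then recurrently fold in
# per-window event features so sparse events densify the state over time.
cell = CAGRUCell(in_ch=64, hid_ch=64)
state = torch.randn(1, 64, 32, 32)                          # frame features
for ev in [torch.randn(1, 64, 32, 32) for _ in range(5)]:   # event windows
    state = cell(ev, state)
print(state.shape)  # torch.Size([1, 64, 32, 32])
```

Under this reading, the attention-driven gate is what lets the recurrence decide where sparse event evidence should overwrite frame-derived state, which is one plausible mechanism for the "feature screening and fusion" the abstract describes.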


Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820c/11645081/875f965d5a3f/sensors-24-07752-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820c/11645081/a82c37e8fe77/sensors-24-07752-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820c/11645081/d8f6e61a1cb2/sensors-24-07752-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820c/11645081/f47b34f789f2/sensors-24-07752-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820c/11645081/67e0193c4573/sensors-24-07752-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/820c/11645081/381b19f91961/sensors-24-07752-g006.jpg

Similar Articles

1. Fusing Events and Frames with Coordinate Attention Gated Recurrent Unit for Monocular Depth Estimation.
   Sensors (Basel). 2024 Dec 4;24(23):7752. doi: 10.3390/s24237752.
2. Online supervised attention-based recurrent depth estimation from monocular video.
   PeerJ Comput Sci. 2020 Nov 23;6:e317. doi: 10.7717/peerj-cs.317. eCollection 2020.
3. A Robust Monocular and Binocular Visual Ranging Fusion Method Based on an Adaptive UKF.
   Sensors (Basel). 2024 Jun 27;24(13):4178. doi: 10.3390/s24134178.
4. Joint Soft-Hard Attention for Self-Supervised Monocular Depth Estimation.
   Sensors (Basel). 2021 Oct 20;21(21):6956. doi: 10.3390/s21216956.
5. Monocular Depth Estimation with Joint Attention Feature Distillation and Wavelet-Based Loss Function.
   Sensors (Basel). 2020 Dec 24;21(1):54. doi: 10.3390/s21010054.
6. AMENet is a monocular depth estimation network designed for automatic stereoscopic display.
   Sci Rep. 2024 Mar 11;14(1):5868. doi: 10.1038/s41598-024-56095-1.
7. Deep Neural Networks for Accurate Depth Estimation with Latent Space Features.
   Biomimetics (Basel). 2024 Dec 9;9(12):747. doi: 10.3390/biomimetics9120747.
8. Event-based feature tracking in a visual inertial odometry framework.
   Front Robot AI. 2023 Feb 14;10:994488. doi: 10.3389/frobt.2023.994488. eCollection 2023.
9. Residual Vision Transformer and Adaptive Fusion Autoencoders for Monocular Depth Estimation.
   Sensors (Basel). 2024 Dec 26;25(1):80. doi: 10.3390/s25010080.
10. SFA-MDEN: Semantic-Feature-Aided Monocular Depth Estimation Network Using Dual Branches.
    Sensors (Basel). 2021 Aug 13;21(16):5476. doi: 10.3390/s21165476.
