


Sparse Representations for Object- and Ego-Motion Estimations in Dynamic Scenes.

Authors

Hirak J. Kashyap, Charless C. Fowlkes, Jeffrey L. Krichmar

Publication

IEEE Trans Neural Netw Learn Syst. 2021 Jun;32(6):2521-2534. doi: 10.1109/TNNLS.2020.3006467. Epub 2021 Jun 2.

DOI: 10.1109/TNNLS.2020.3006467
PMID: 32687472
Abstract

Disentangling the sources of visual motion in a dynamic scene during self-movement or ego motion is important for autonomous navigation and tracking. In the dynamic image segments of a video frame containing independently moving objects, optic flow relative to the next frame is the sum of the motion fields generated due to camera and object motion. The traditional ego-motion estimation methods assume the scene to be static, and the recent deep learning-based methods do not separate pixel velocities into object- and ego-motion components. We propose a learning-based approach to predict both ego-motion parameters and object-motion field (OMF) from image sequences using a convolutional autoencoder while being robust to variations due to the unconstrained scene depth. This is achieved by: 1) training with continuous ego-motion constraints that allow solving for ego-motion parameters independently of depth and 2) learning a sparsely activated overcomplete ego-motion field (EMF) basis set, which eliminates the irrelevant components in both static and dynamic segments for the task of ego-motion estimation. In order to learn the EMF basis set, we propose a new differentiable sparsity penalty function that approximates the number of nonzero activations in the bottleneck layer of the autoencoder and enforces sparsity more effectively than L1- and L2-norm-based penalties. Unlike the existing direct ego-motion estimation methods, the predicted global EMF can be used to extract OMF directly by comparing it against the optic flow. Compared with the state-of-the-art baselines, the proposed model performs favorably on pixelwise object- and ego-motion estimation tasks when evaluated on real and synthetic data sets of dynamic scenes.
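The differentiable sparsity penalty is the key ingredient here: it approximates the count of nonzero bottleneck activations (the L0 "norm"), which L1- and L2-norm penalties only loosely proxy. As a rough illustration of the idea, here is a generic Gaussian-based L0 surrogate (not necessarily the paper's exact function; the smoothing width `sigma` is an assumed parameter):

```python
import numpy as np

def l0_surrogate(z, sigma=0.1):
    """Smooth approximation to the number of nonzero entries in z.

    Each term 1 - exp(-z_i^2 / (2*sigma^2)) approaches 1 when |z_i| >> sigma
    and 0 as z_i -> 0, so the sum approximates the nonzero count while
    staying differentiable everywhere (unlike the true L0 "norm").
    """
    return float(np.sum(1.0 - np.exp(-z**2 / (2.0 * sigma**2))))

z = np.array([0.0, 0.0, 0.9, -1.2, 0.0])
print(round(l0_surrogate(z), 3))  # close to 2.0, the true nonzero count
```

Note the contrast with an L1 penalty, which for the vector above would return 2.1 and keep shrinking large activations: the surrogate saturates near 1 per active unit, so it penalizes how many units fire rather than how strongly, which is what an overcomplete basis with few active elements needs.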

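Because the predicted ego-motion field (EMF) is a full per-pixel flow, the abstract's claim that the OMF "can be extracted directly by comparing it against the optic flow" amounts to taking a residual. A toy sketch of that comparison (the array shapes and flow values below are invented purely for illustration):

```python
import numpy as np

# Toy flow fields (H x W x 2). Per the abstract, in dynamic segments the
# optic flow is the sum of camera- and object-induced motion, so the
# object-motion field (OMF) can be read off as a residual.
H, W = 4, 4
ego_flow = np.full((H, W, 2), 0.5)        # uniform camera-induced flow (EMF)
object_flow = np.zeros((H, W, 2))
object_flow[1:3, 1:3] = [1.0, -0.5]       # one independently moving object
optic_flow = ego_flow + object_flow       # what a flow estimator observes

omf = optic_flow - ego_flow               # OMF = optic flow - EMF
moving_mask = np.linalg.norm(omf, axis=-1) > 1e-6
print(int(moving_mask.sum()))  # 4 pixels belong to the moving object
```

In the real model the EMF comes from the learned sparse basis rather than being given, but the extraction step itself is this simple subtraction followed by thresholding.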

Similar Articles

1. Sparse Representations for Object- and Ego-Motion Estimations in Dynamic Scenes. IEEE Trans Neural Netw Learn Syst. 2021 Jun;32(6):2521-2534. doi: 10.1109/TNNLS.2020.3006467. Epub 2021 Jun 2.
2. Adaptive Absolute Ego-Motion Estimation Using Wearable Visual-Inertial Sensors for Indoor Positioning. Micromachines (Basel). 2018 Mar 6;9(3):113. doi: 10.3390/mi9030113.
3. Motion stereo using ego-motion complex logarithmic mapping. IEEE Trans Pattern Anal Mach Intell. 1987 Mar;9(3):356-69. doi: 10.1109/tpami.1987.4767919.
4. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding. IEEE Trans Pattern Anal Mach Intell. 2019 Jul 23. doi: 10.1109/TPAMI.2019.2930258.
5. Joint Stereo Video Deblurring, Scene Flow Estimation and Moving Object Segmentation. IEEE Trans Image Process. 2019 Oct 11. doi: 10.1109/TIP.2019.2945867.
6. Unsupervised Learning of Monocular Depth and Ego-Motion with Optical Flow Features and Multiple Constraints. Sensors (Basel). 2022 Feb 11;22(4):1383. doi: 10.3390/s22041383.
7. DOT-SLAM: A Stereo Visual Simultaneous Localization and Mapping (SLAM) System with Dynamic Object Tracking Based on Graph Optimization. Sensors (Basel). 2024 Jul 18;24(14):4676. doi: 10.3390/s24144676.
8. Self-Supervised monocular depth and ego-motion estimation in endoscopy: Appearance flow to the rescue. Med Image Anal. 2022 Apr;77:102338. doi: 10.1016/j.media.2021.102338. Epub 2021 Dec 25.
9. Variation in the local motion statistics of real-life optic flow scenes. Neural Comput. 2012 Jul;24(7):1781-805. doi: 10.1162/NECO_a_00294. Epub 2012 Mar 19.
10. Stereovision-Based Ego-Motion Estimation for Combine Harvesters. Sensors (Basel). 2022 Aug 25;22(17):6394. doi: 10.3390/s22176394.

Cited By

1. Deep Neural Networks for Accurate Depth Estimation with Latent Space Features. Biomimetics (Basel). 2024 Dec 9;9(12):747. doi: 10.3390/biomimetics9120747.
2. ReLU, Sparseness, and the Encoding of Optic Flow in Neural Networks. Sensors (Basel). 2024 Nov 22;24(23):7453. doi: 10.3390/s24237453.
3. Accuracy optimized neural networks do not effectively model optic flow tuning in brain area MSTd. Front Neurosci. 2024 Sep 2;18:1441285. doi: 10.3389/fnins.2024.1441285. eCollection 2024.
4. Canonical circuit computations for computer vision. Biol Cybern. 2023 Oct;117(4-5):299-329. doi: 10.1007/s00422-023-00966-9. Epub 2023 Jun 12.