

Sparse Representations for Object- and Ego-Motion Estimations in Dynamic Scenes.

Author Information

Kashyap Hirak J, Fowlkes Charless C, Krichmar Jeffrey L

Publication Information

IEEE Trans Neural Netw Learn Syst. 2021 Jun;32(6):2521-2534. doi: 10.1109/TNNLS.2020.3006467. Epub 2021 Jun 2.

Abstract

Disentangling the sources of visual motion in a dynamic scene during self-movement or ego motion is important for autonomous navigation and tracking. In the dynamic image segments of a video frame containing independently moving objects, optic flow relative to the next frame is the sum of the motion fields generated due to camera and object motion. The traditional ego-motion estimation methods assume the scene to be static, and the recent deep learning-based methods do not separate pixel velocities into object- and ego-motion components. We propose a learning-based approach to predict both ego-motion parameters and object-motion field (OMF) from image sequences using a convolutional autoencoder while being robust to variations due to the unconstrained scene depth. This is achieved by: 1) training with continuous ego-motion constraints that allow solving for ego-motion parameters independently of depth and 2) learning a sparsely activated overcomplete ego-motion field (EMF) basis set, which eliminates the irrelevant components in both static and dynamic segments for the task of ego-motion estimation. In order to learn the EMF basis set, we propose a new differentiable sparsity penalty function that approximates the number of nonzero activations in the bottleneck layer of the autoencoder and enforces sparsity more effectively than L1- and L2-norm-based penalties. Unlike the existing direct ego-motion estimation methods, the predicted global EMF can be used to extract OMF directly by comparing it against the optic flow. Compared with the state-of-the-art baselines, the proposed model performs favorably on pixelwise object- and ego-motion estimation tasks when evaluated on real and synthetic data sets of dynamic scenes.
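The abstract describes an additive decomposition in dynamic segments: the optic flow is the sum of the camera-induced ego-motion field (EMF) and the object-motion field (OMF), so the OMF can be read off as the residual of the optic flow over the predicted global EMF. The following is a minimal sketch of that comparison step, assuming per-pixel (u, v) fields stored as H×W×2 arrays; the function name, shapes, and optional mask argument are illustrative assumptions, not details from the paper.

```python
# Minimal sketch (not the authors' code): given per-pixel optic flow and a
# predicted global ego-motion field (EMF), the object-motion field (OMF)
# follows from the additive decomposition described in the abstract.
# Array shapes and variable names are illustrative assumptions.
import numpy as np

def extract_omf(optic_flow: np.ndarray, emf: np.ndarray,
                dynamic_mask: np.ndarray | None = None) -> np.ndarray:
    """Object-motion field as the residual of optic flow over the EMF.

    optic_flow, emf : (H, W, 2) per-pixel (u, v) motion fields.
    dynamic_mask    : optional (H, W) boolean mask of independently moving
                      segments; outside the mask the OMF is zeroed.
    """
    omf = optic_flow - emf
    if dynamic_mask is not None:
        omf = omf * dynamic_mask[..., None]
    return omf
```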

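The abstract also mentions a new differentiable sparsity penalty that approximates the number of nonzero activations in the autoencoder's bottleneck layer. Its exact form is not given here; the sketch below uses a common smooth L0 surrogate purely to illustrate what a differentiable approximation of the nonzero count looks like. It is not the paper's penalty function, and the scale parameter and loss weighting are hypothetical.

```python
# Illustrative sketch only: the paper's penalty approximates the count of
# nonzero bottleneck activations, but its exact form is not stated in the
# abstract. A common smooth L0 surrogate, sum(1 - exp(-a^2 / sigma^2)),
# is shown here to convey the idea; it is NOT the authors' function.
import torch

def smooth_l0_penalty(activations: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Differentiable approximation of the number of nonzero activations.

    Each term approaches 1 when |a| >> sigma and 0 when a == 0, so the sum
    approximates the L0 count while staying differentiable everywhere,
    unlike L1- or L2-norm penalties, which do not count activations.
    """
    return torch.sum(1.0 - torch.exp(-(activations ** 2) / (sigma ** 2)))

# Hypothetical usage inside an autoencoder training loss:
# loss = reconstruction_loss + lambda_sparse * smooth_l0_penalty(bottleneck)
```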
