Deep Learning Driven Visual Path Prediction From a Single Image.

Publication information

IEEE Trans Image Process. 2016 Dec;25(12):5892-5904. doi: 10.1109/TIP.2016.2613686. Epub 2016 Sep 26.

Abstract

Inference and prediction are significant capabilities of visual systems. Visual path prediction is an important and challenging task among them: the goal is to infer the future path of a visual object in a static scene. The task is difficult because it requires high-level semantic understanding of both the scene and the underlying motion patterns in video sequences. In practice, cluttered scenes place further demands on the effectiveness and robustness of models. Motivated by these observations, we propose a deep learning framework that jointly performs deep feature learning for visual representation and spatiotemporal context modeling. A unified path-planning scheme then makes the path prediction from the analytic results returned by the deep context models. The learned visual representation and deep context models give the framework a deep semantic understanding of scenes and motion patterns, which improves performance on the visual path prediction task. In experiments, we extensively evaluate the model on two large benchmark datasets constructed by adapting video tracking datasets. The qualitative and quantitative results show that our approach outperforms state-of-the-art approaches and generalizes better.
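The abstract does not say what form the "analytic results" take or how the path-planning scheme works. As a rough illustration only, the sketch below assumes the context models produce a 2D grid of non-negative traversal costs (an assumption, not something the abstract states) and finds the cheapest route between two cells with Dijkstra's algorithm. The function name `plan_path`, the grid layout, and the 4-connected neighborhood are all hypothetical choices made for this example; it is not the authors' method.

```python
import heapq


def plan_path(cost, start, goal):
    """Cheapest 4-connected path on a grid of non-negative cell costs.

    `cost` is a hypothetical stand-in for a map a context model might output;
    entering a cell adds that cell's cost to the path total.
    """
    rows, cols = len(cost), len(cost[0])
    dist = {start: 0.0}   # best known cost to reach each cell
    prev = {}             # back-pointers for path reconstruction
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist.get(cell, float("inf")):
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return []
    # Walk the back-pointers from goal to start, then reverse.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy map: the middle column is expensive except in the bottom row.
toy_cost = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]
route = plan_path(toy_cost, (0, 0), (0, 2))
```

On the toy map the route detours through the bottom row rather than crossing the high-cost column, which is the kind of behavior one would want if high cost marked regions the object is unlikely to enter.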
