A Review on Deep Learning Techniques for Video Prediction

Publication information

IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):2806-2826. doi: 10.1109/TPAMI.2020.3045007. Epub 2022 May 5.

DOI: 10.1109/TPAMI.2020.3045007

PMID: 33320810

Abstract

The ability to predict, anticipate and reason about future outcomes is a key component of intelligent decision-making systems. In light of the success of deep learning in computer vision, deep-learning-based video prediction emerged as a promising research direction. Defined as a self-supervised learning task, video prediction represents a suitable framework for representation learning, as it demonstrated potential capabilities for extracting meaningful representations of the underlying patterns in natural videos. Motivated by the increasing interest in this task, we provide a review on the deep learning methods for prediction in video sequences. We first define the video prediction fundamentals, as well as mandatory background concepts and the most used datasets. Next, we carefully analyze existing video prediction models organized according to a proposed taxonomy, highlighting their contributions and their significance in the field. The summary of the datasets and methods is accompanied with experimental results that facilitate the assessment of the state of the art on a quantitative basis. The paper is summarized by drawing some general conclusions, identifying open research challenges and by pointing out future research directions.

Similar articles

A Review on Deep Learning Techniques for Video Prediction.

IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):2806-2826. doi: 10.1109/TPAMI.2020.3045007. Epub 2022 May 5.

Deep learning approaches for seizure video analysis: A review.

Epilepsy Behav. 2024 May;154:109735. doi: 10.1016/j.yebeh.2024.109735. Epub 2024 Mar 23.

Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey.

IEEE Trans Pattern Anal Mach Intell. 2021 Nov;43(11):4037-4058. doi: 10.1109/TPAMI.2020.2992393. Epub 2021 Oct 1.

Deep Learning Driven Visual Path Prediction From a Single Image.

IEEE Trans Image Process. 2016 Dec;25(12):5892-5904. doi: 10.1109/TIP.2016.2613686. Epub 2016 Sep 26.

Deep learning algorithms applied to computational chemistry.

Mol Divers. 2024 Aug;28(4):2375-2410. doi: 10.1007/s11030-023-10771-y. Epub 2023 Dec 27.

Biomedical Data and Deep Learning Computational Models for Predicting Compound-Protein Relations.

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jul-Aug;19(4):2092-2110. doi: 10.1109/TCBB.2021.3069040. Epub 2022 Aug 8.

3-D PersonVLAD: Learning Deep Global Representations for Video-Based Person Reidentification.

IEEE Trans Neural Netw Learn Syst. 2019 Nov;30(11):3347-3359. doi: 10.1109/TNNLS.2019.2891244. Epub 2019 Feb 1.

A Review of Deep Learning-Based Methods for Pedestrian Trajectory Prediction.

Sensors (Basel). 2021 Nov 13;21(22):7543. doi: 10.3390/s21227543.

Revisiting Video Saliency Prediction in the Deep Learning Era.

IEEE Trans Pattern Anal Mach Intell. 2021 Jan;43(1):220-237. doi: 10.1109/TPAMI.2019.2924417. Epub 2020 Dec 4.

Self-Supervised Representation Learning for Ultrasound Video.

Proc IEEE Int Symp Biomed Imaging. 2020 Apr 3;2020:1847-1850. doi: 10.1109/ISBI45749.2020.9098666.

Cited by

Deep learning methods to forecasting human embryo development in time-lapse videos.

PLoS One. 2025 Sep 2;20(9):e0330924. doi: 10.1371/journal.pone.0330924. eCollection 2025.

Brain-like border ownership signals support prediction of natural videos.

iScience. 2025 Mar 11;28(4):112199. doi: 10.1016/j.isci.2025.112199. eCollection 2025 Apr 18.

Decoding viewer emotions in video ads.

Sci Rep. 2024 Nov 2;14(1):26382. doi: 10.1038/s41598-024-76968-9.

Bridging vision and touch: advancing robotic interaction prediction with self-supervised multimodal learning.

Front Robot AI. 2024 Sep 30;11:1407519. doi: 10.3389/frobt.2024.1407519. eCollection 2024.

Brain-like border ownership signals support prediction of natural videos.

bioRxiv. 2024 Aug 12:2024.08.11.607040. doi: 10.1101/2024.08.11.607040.

Diffusion Probabilistic Modeling for Video Generation.

Entropy (Basel). 2023 Oct 20;25(10):1469. doi: 10.3390/e25101469.

Video frame prediction of microbial growth with a recurrent neural network.

Front Microbiol. 2023 Jan 5;13:1034586. doi: 10.3389/fmicb.2022.1034586. eCollection 2022.

TRUST: A Novel Framework for Vehicle Trajectory Recovery from Urban-Scale Videos.

Sensors (Basel). 2022 Dec 16;22(24):9948. doi: 10.3390/s22249948.

Recent Advances in Artificial Intelligence and Tactical Autonomy: Current Status, Challenges, and Perspectives.

Sensors (Basel). 2022 Dec 16;22(24):9916. doi: 10.3390/s22249916.

A texture-aware U-Net for identifying incomplete blinking from eye videography.

Biomed Signal Process Control. 2022 May;75. doi: 10.1016/j.bspc.2022.103630. Epub 2022 Mar 16.
