Wu Xinheng, Lu Jie, Yan Zheng, Zhang Guangquan
IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15427-15441. doi: 10.1109/TNNLS.2023.3286890. Epub 2024 Oct 29.
Unsupervised video prediction aims to predict future frames from observed video frames, without requiring supervisory annotations. The task has been argued to be a key component of intelligent decision-making systems, because it offers a way to model the underlying patterns of videos. The central challenge of video prediction is to model the complex, spatiotemporal, and often uncertain dynamics of high-dimensional video data effectively. An appealing way to model spatiotemporal dynamics is to exploit prior physical knowledge, such as partial differential equations (PDEs). In this article, treating real-world video data as a partly observed stochastic environment, we introduce a new stochastic PDE predictor (SPDE-predictor), which models the spatiotemporal dynamics by approximating a generalized form of PDEs while accounting for stochasticity. A second contribution is that we disentangle high-dimensional video prediction into low-dimensional factors of variation: time-varying stochastic PDE dynamics and time-invariant content factors. Extensive experiments on four diverse video datasets show that the SPDE video prediction model (SPDE-VP) outperforms both deterministic and stochastic state-of-the-art methods. Ablation studies show that the gains come from both PDE dynamics modeling and disentangled representation learning, and highlight the relevance of both to long-term video prediction.
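To make the idea of time-varying stochastic PDE dynamics combined with a time-invariant content factor more concrete, the following is a minimal toy sketch. It is not the SPDE-predictor from the article, which approximates a generalized PDE form. As a stand-in, it assumes a fixed heat equation with additive noise, integrated with an Euler-Maruyama step, and a trivial "decoder" that adds a static content image to the evolving state. All names (`laplacian`, `spde_step`, `rollout`) and parameter values are hypothetical.

```python
import numpy as np


def laplacian(u):
    """Five-point finite-difference Laplacian, periodic boundaries, unit grid spacing."""
    return (
        np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
        + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)
        - 4.0 * u
    )


def spde_step(u, rng, diffusion=0.1, noise_scale=0.05, dt=0.1):
    """One Euler-Maruyama step of du = D * lap(u) dt + sigma dW.

    Assumption: a fixed heat equation replaces the learned PDE of the article.
    """
    drift = diffusion * laplacian(u)
    noise = rng.standard_normal(u.shape)
    return u + dt * drift + noise_scale * np.sqrt(dt) * noise


def rollout(u0, content, steps, seed=0):
    """Roll the stochastic state forward and combine it with static content.

    `content` plays the role of the time-invariant factor and `u` the role of
    the time-varying dynamics; adding them is a toy decoder, not the paper's.
    """
    rng = np.random.default_rng(seed)
    u = u0.copy()
    frames = []
    for _ in range(steps):
        u = spde_step(u, rng)
        frames.append(content + u)
    return np.stack(frames)


if __name__ == "__main__":
    init_rng = np.random.default_rng(42)
    u0 = init_rng.standard_normal((16, 16))   # initial dynamic state
    content = np.ones((16, 16))               # static content image
    frames = rollout(u0, content, steps=5)
    print(frames.shape)
```

Different seeds give different sampled futures from the same initial state, which is the sense in which such a predictor is stochastic rather than deterministic.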