Disentangling Stochastic PDE Dynamics for Unsupervised Video Prediction.

Author information

Wu Xinheng, Lu Jie, Yan Zheng, Zhang Guangquan

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15427-15441. doi: 10.1109/TNNLS.2023.3286890. Epub 2024 Oct 29.

Abstract

Unsupervised video prediction aims to predict future outcomes based on the observed video frames, thus removing the need for supervisory annotations. This research task has been argued to be a key component of intelligent decision-making systems, as it offers the potential to model the underlying patterns of videos. Essentially, the challenge of video prediction is to effectively model the complex spatiotemporal and often uncertain dynamics of high-dimensional video data. In this context, an appealing way of modeling spatiotemporal dynamics is to explore prior physical knowledge, such as partial differential equations (PDEs). In this article, treating real-world video data as a partly observed stochastic environment, we introduce a new stochastic PDE predictor (SPDE-predictor), which models the spatiotemporal dynamics by approximating a generalized form of PDEs while dealing with the stochasticity. A second contribution is that we disentangle high-dimensional video prediction into low-dimensional factors of variation: time-varying stochastic PDE dynamics and time-invariant content factors. Extensive experiments on four diverse video datasets show that the SPDE video prediction model (SPDE-VP) outperforms both deterministic and stochastic state-of-the-art methods. Ablation studies show that this advantage is driven by both PDE dynamics modeling and disentangled representation learning, and highlight their relevance to long-term video prediction.
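To make the idea of "time-varying stochastic PDE dynamics plus time-invariant content" more concrete, the following is a minimal toy sketch and not the SPDE-VP model from the paper. It assumes a hand-picked stochastic heat equation in place of the learned generalized PDE, a periodic grid, and a simple additive combination of content and dynamics. All function names and parameters (`laplacian`, `spde_step`, `rollout`, `diffusivity`, `noise_scale`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def laplacian(u):
    """Five-point finite-difference Laplacian, periodic boundaries, unit grid spacing."""
    return (
        np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
        + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1)
        - 4.0 * u
    )


def spde_step(u, dt, diffusivity, noise_scale, rng):
    """One Euler-Maruyama step of du = diffusivity * Laplacian(u) dt + noise_scale dW."""
    drift = diffusivity * laplacian(u)
    noise = noise_scale * np.sqrt(dt) * rng.standard_normal(u.shape)
    return u + dt * drift + noise


def rollout(u0, content, steps, dt=0.1, diffusivity=0.5, noise_scale=0.05, seed=0):
    """Evolve the time-varying field and add a fixed content image to every frame.

    Assumption: frames are formed as content + dynamics; the paper's actual
    decoder is not described in the abstract.
    """
    rng = np.random.default_rng(seed)
    u = u0.copy()
    frames = []
    for _ in range(steps):
        u = spde_step(u, dt, diffusivity, noise_scale, rng)
        frames.append(content + u)
    return np.stack(frames)


# Hypothetical 8x8 example: a single bright pixel diffusing over a flat background.
u0 = np.zeros((8, 8))
u0[4, 4] = 1.0
content = np.ones((8, 8))
frames = rollout(u0, content, steps=5)
```

Different seeds give different futures from the same observed state, which is the sense in which such a predictor is stochastic; setting `noise_scale=0.0` recovers a deterministic rollout.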
