Disentangling Stochastic PDE Dynamics for Unsupervised Video Prediction.

Authors

Wu Xinheng, Lu Jie, Yan Zheng, Zhang Guangquan

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15427-15441. doi: 10.1109/TNNLS.2023.3286890. Epub 2024 Oct 29.

DOI:10.1109/TNNLS.2023.3286890

Abstract

Unsupervised video prediction aims to predict future outcomes based on the observed video frames, thus removing the need for supervisory annotations. This research task has been argued to be a key component of intelligent decision-making systems, as it offers the potential to model the underlying patterns of videos. Essentially, the challenge of video prediction is to effectively model the complex spatiotemporal and often uncertain dynamics of high-dimensional video data. In this context, an appealing way of modeling spatiotemporal dynamics is to explore prior physical knowledge, such as partial differential equations (PDEs). In this article, considering real-world video data as a partly observed stochastic environment, we introduce a new stochastic PDE predictor (SPDE-predictor), which models the spatiotemporal dynamics by approximating a generalized form of PDEs while dealing with the stochasticity. A second contribution is that we disentangle the high-dimensional video prediction into low-dimensional factors of variation: time-varying stochastic PDE dynamics and time-invariant content factors. Extensive experiments on four diverse video datasets show that the SPDE video prediction model (SPDE-VP) outperforms both deterministic and stochastic state-of-the-art methods. Ablation studies show that this advantage is driven by both PDE dynamics modeling and disentangled representation learning, and highlight their relevance to long-term video prediction.
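
To make the phrase "stochastic PDE dynamics" concrete, the following is a minimal numerical sketch of rolling a 2D field forward under a stochastic heat equation, du = D ∇²u dt + σ dW, using an Euler-Maruyama step. It is not the authors' SPDE-predictor, whose architecture the abstract does not detail; the equation, the function names (`laplacian`, `spde_step`, `rollout`), and all parameter values are assumptions chosen purely for illustration.

```python
import numpy as np

def laplacian(u):
    """Five-point finite-difference Laplacian, periodic boundaries, unit grid spacing."""
    return (np.roll(u, 1, axis=0) + np.roll(u, -1, axis=0)
            + np.roll(u, 1, axis=1) + np.roll(u, -1, axis=1) - 4.0 * u)

def spde_step(u, rng, diffusivity=0.1, sigma=0.05, dt=0.1):
    """One Euler-Maruyama step of du = D * lap(u) dt + sigma dW (illustrative values)."""
    noise = rng.standard_normal(u.shape)
    return u + dt * diffusivity * laplacian(u) + sigma * np.sqrt(dt) * noise

def rollout(u0, steps, seed=0, **kwargs):
    """Roll the field forward; returns an array of shape (steps + 1, H, W)."""
    rng = np.random.default_rng(seed)
    states = [u0]
    for _ in range(steps):
        states.append(spde_step(states[-1], rng, **kwargs))
    return np.stack(states)

# A single bright pixel diffusing under noise stands in for a "frame".
u0 = np.zeros((16, 16))
u0[8, 8] = 1.0
traj = rollout(u0, steps=5)
```

Different seeds give different plausible futures from the same initial frame, which is the sense in which a stochastic predictor differs from a deterministic one; setting `sigma=0.0` recovers a plain deterministic PDE rollout.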
