MagicTime: Time-Lapse Video Generation Models as Metamorphic Simulators.

Author information

Yuan Shenghai, Huang Jinfa, Shi Yujun, Xu Yongqi, Zhu Ruijie, Lin Bin, Cheng Xinhua, Yuan Li, Luo Jiebo

Publication information

IEEE Trans Pattern Anal Mach Intell. 2025 Sep;47(9):7340-7351. doi: 10.1109/TPAMI.2025.3558507.

DOI: 10.1109/TPAMI.2025.3558507

PMID: 40198299

Abstract

Recent advances in text-to-video (T2V) generation have achieved remarkable success in synthesizing high-quality general videos from textual descriptions. A largely overlooked problem in T2V is that existing models do not adequately encode physical knowledge of the real world, so generated videos tend to have limited motion and poor variation. In this paper, we propose MagicTime, a metamorphic time-lapse video generation model that learns real-world physics knowledge from time-lapse videos and implements metamorphic generation. First, we design a simple yet effective two-stage Magic Adaptive Strategy to encode more physical knowledge from metamorphic videos and transform pre-trained T2V models to generate metamorphic videos. Second, we introduce a Dynamic Frames Extraction strategy to adapt to metamorphic time-lapse videos, which have a wider variation range and cover dramatic object metamorphic processes, and thus embody more physical knowledge than general videos. Third, we introduce a Magic Text-Encoder to improve the understanding of metamorphic video prompts. In addition, we create a time-lapse video-text dataset called ChronoMagic, specifically curated to unlock the metamorphic video generation ability. Extensive experiments demonstrate the superiority and effectiveness of MagicTime for generating high-quality and dynamic metamorphic videos, suggesting that time-lapse video generation is a promising path toward building metamorphic simulators of the physical world.
