Wolf Rosa, Shi Yitian, Liu Sheng, Rayyes Rania
AI and Robotics (AIR), Institute of Material Handling and Logistics (IFL), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany.
Front Robot AI. 2025 Sep 9;12:1606247. doi: 10.3389/frobt.2025.1606247. eCollection 2025.
Diffusion generative models have demonstrated remarkable success in visual domains such as image and video generation. They have also recently emerged as a promising approach in robotics, especially in robot manipulation. Built on a probabilistic framework, diffusion models stand out for their ability to model multi-modal distributions and their robustness to high-dimensional input and output spaces. This survey provides a comprehensive review of state-of-the-art diffusion models in robotic manipulation, covering grasp learning, trajectory planning, and data augmentation. Diffusion models for scene and image augmentation lie at the intersection of robotics and computer vision, improving the generalizability of vision-based tasks and mitigating data scarcity. This paper also presents the two main frameworks of diffusion models and their integration with imitation learning and reinforcement learning. In addition, it discusses common architectures and benchmarks, and points out the challenges and advantages of current state-of-the-art diffusion-based methods.
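To make the probabilistic framework the abstract refers to concrete, the following is a minimal, hypothetical sketch (not code from the survey) of the DDPM-style forward noising recursion and a deterministic reverse update. A trained noise-prediction network is replaced here by an oracle that returns the exact noise, so the reverse loop recovers the clean sample; the schedule values and the scalar "action" are illustrative assumptions.

```python
import math
import random

# Linear noise schedule (assumed values, chosen only for illustration).
T = 50
betas = [1e-4 + (0.02 - 1e-4) * t / (T - 1) for t in range(T)]
alpha_bars = []
prod = 1.0
for b in betas:
    prod *= (1.0 - b)            # alpha_bar_t = prod_{s<=t} (1 - beta_s)
    alpha_bars.append(prod)

x0 = 0.7                         # clean scalar sample (e.g. one action dimension)

def oracle_eps(xt, t):
    """Stand-in for a trained noise predictor eps_theta(x_t, t).

    Because it knows x0, it returns the exact noise component of x_t,
    so the reverse loop below reconstructs x0 perfectly.
    """
    return (xt - math.sqrt(alpha_bars[t]) * x0) / math.sqrt(1.0 - alpha_bars[t])

# Forward process: x_T = sqrt(alpha_bar_T) * x0 + sqrt(1 - alpha_bar_T) * eps
random.seed(0)
xt = math.sqrt(alpha_bars[-1]) * x0 + math.sqrt(1.0 - alpha_bars[-1]) * random.gauss(0, 1)

# Reverse process (deterministic, DDIM-style update for clarity; DDPM
# sampling would add fresh Gaussian noise at each step).
for t in reversed(range(T)):
    eps_hat = oracle_eps(xt, t)
    x0_hat = (xt - math.sqrt(1.0 - alpha_bars[t]) * eps_hat) / math.sqrt(alpha_bars[t])
    if t > 0:
        xt = math.sqrt(alpha_bars[t - 1]) * x0_hat + math.sqrt(1.0 - alpha_bars[t - 1]) * eps_hat
    else:
        xt = x0_hat

print(round(xt, 4))  # recovers the clean sample, 0.7
```

In practice the oracle is replaced by a neural network trained to predict the injected noise; iterating the same reverse update from pure Gaussian noise then draws new samples from the learned (possibly multi-modal) distribution, which is what makes the framework attractive for manipulation policies.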