A Survey of Imitation Learning: Algorithms, Recent Developments, and Challenges.

Authors

Zare Maryam, Kebria Parham M, Khosravi Abbas, Nahavandi Saeid

Publication information

IEEE Trans Cybern. 2024 Dec;54(12):7173-7186. doi: 10.1109/TCYB.2024.3395626. Epub 2024 Nov 27.

DOI: 10.1109/TCYB.2024.3395626
PMID: 39024072

Abstract

In recent years, the development of robotics and artificial intelligence (AI) systems has been nothing short of remarkable. As these systems continue to evolve, they are being utilized in increasingly complex and unstructured environments, such as autonomous driving, aerial robotics, and natural language processing. As a consequence, programming their behaviors manually or defining their behavior through the reward functions [as done in reinforcement learning (RL)] has become exceedingly difficult. This is because such environments require a high degree of flexibility and adaptability, making it challenging to specify an optimal set of rules or reward signals that can account for all the possible situations. In such environments, learning from an expert's behavior through imitation is often more appealing. This is where imitation learning (IL) comes into play - a process where desired behavior is learned by imitating an expert's behavior, which is provided through demonstrations.

This article aims to provide an introduction to IL and an overview of its underlying assumptions and approaches. It also offers a detailed description of recent advances and emerging areas of research in the field. Additionally, this article discusses how researchers have addressed common challenges associated with IL and provides potential directions for future research. Overall, the goal of this article is to provide a comprehensive guide to the growing field of IL in robotics and AI.

Similar articles

1. A Survey of Imitation Learning: Algorithms, Recent Developments, and Challenges.
   IEEE Trans Cybern. 2024 Dec;54(12):7173-7186. doi: 10.1109/TCYB.2024.3395626. Epub 2024 Nov 27.
2. Leveraging imitation learning in agricultural robotics: a comprehensive survey and comparative analysis.
   Front Robot AI. 2024 Oct 17;11:1441312. doi: 10.3389/frobt.2024.1441312. eCollection 2024.
3. Inverse Reinforcement Q-Learning Through Expert Imitation for Discrete-Time Systems.
   IEEE Trans Neural Netw Learn Syst. 2023 May;34(5):2386-2399. doi: 10.1109/TNNLS.2021.3106635. Epub 2023 May 2.
4. Reinforcement Learning Algorithms and Applications in Healthcare and Robotics: A Comprehensive and Systematic Review.
   Sensors (Basel). 2024 Apr 11;24(8):2461. doi: 10.3390/s24082461.
5. Generative Upper-Level Policy Imitation Learning With Pareto-Improvement for Energy-Efficient Advanced Machining Systems.
   IEEE Trans Neural Netw Learn Syst. 2025 Mar;36(3):5190-5203. doi: 10.1109/TNNLS.2024.3372641. Epub 2025 Feb 28.
6. MOSAIC for multiple-reward environments.
   Neural Comput. 2012 Mar;24(3):577-606. doi: 10.1162/NECO_a_00246. Epub 2011 Dec 14.
7. Error Bounds of Imitating Policies and Environments for Reinforcement Learning.
   IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):6968-6980. doi: 10.1109/TPAMI.2021.3096966. Epub 2022 Sep 14.
8. Forward and inverse reinforcement learning sharing network weights and hyperparameters.
   Neural Netw. 2021 Dec;144:138-153. doi: 10.1016/j.neunet.2021.08.017. Epub 2021 Aug 20.
9. Artificial Intelligence Surgery: How Do We Get to Autonomous Actions in Surgery?
   Sensors (Basel). 2021 Aug 17;21(16):5526. doi: 10.3390/s21165526.
10. Is imitation learning the route to humanoid robots?
    Trends Cogn Sci. 1999 Jun;3(6):233-242. doi: 10.1016/s1364-6613(99)01327-3.

Cited by

1. Robotic Arm Trajectory Planning in Dynamic Environments Based on Self-Optimizing Replay Mechanism.
   Sensors (Basel). 2025 Jul 29;25(15):4681. doi: 10.3390/s25154681.
2. Graph attention networks based multi-agent path finding via temporal-spatial information aggregation.
   PLoS One. 2025 Jun 16;20(6):e0318981. doi: 10.1371/journal.pone.0318981. eCollection 2025.
3. Conditional variational auto encoder based dynamic motion for multitask imitation learning.
   Sci Rep. 2025 Mar 17;15(1):9196. doi: 10.1038/s41598-025-93888-4.
4. A highly-configurable session designer for VR nursing training.
   Heliyon. 2024 Nov 7;10(22):e39692. doi: 10.1016/j.heliyon.2024.e39692. eCollection 2024 Nov 30.
5. Leveraging imitation learning in agricultural robotics: a comprehensive survey and comparative analysis.
   Front Robot AI. 2024 Oct 17;11:1441312. doi: 10.3389/frobt.2024.1441312. eCollection 2024.
6. Model Predictive Control with Variational Autoencoders for Signal Temporal Logic Specifications.
   Sensors (Basel). 2024 Jul 14;24(14):4567. doi: 10.3390/s24144567.