Quality-diversity based semi-autonomous teleoperation using reinforcement learning.

Affiliations

Department of Artificial Intelligence, Korea University, Seoul, 02841, South Korea.

Publication information

Neural Netw. 2024 Nov;179:106543. doi: 10.1016/j.neunet.2024.106543. Epub 2024 Jul 22.

DOI:10.1016/j.neunet.2024.106543
PMID:39089158
Abstract

Recent successes in robot learning have significantly enhanced autonomous systems across a wide range of tasks. However, these systems are prone to generating similar or identical solutions, which limits how well the robot can be controlled to behave according to user intentions. Such limited robot behaviors may lead to collisions and potential harm to humans. To resolve these limitations, we introduce a semi-autonomous teleoperation framework that enables users to operate a robot by selecting a high-level command, referred to as an option. Our approach aims to provide effective and diverse options through a learned policy, thereby enhancing the efficiency of the proposed framework. In this work, we propose a quality-diversity (QD) based sampling method that simultaneously optimizes both the quality and diversity of options using reinforcement learning (RL). Additionally, we present a mixture of latent variable models to learn multiple policy distributions defined as options. In experiments, we show that the proposed method achieves superior performance in terms of the success rate and diversity of the options in simulation environments. We further demonstrate that our method outperforms manual keyboard control in terms of time duration in cluttered real-world environments.
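
The abstract does not describe how quality and diversity are combined, so the following is only a minimal illustrative sketch of the general trade-off, not the authors' method. It assumes a hypothetical setup in which each candidate option is a point in a small feature space and already has a scalar quality score; the function name `select_options`, the parameter `diversity_weight`, and the greedy selection rule are all assumptions introduced here for illustration. The paper itself uses RL and a mixture of latent variable models, neither of which is modeled below.

```python
import math


def select_options(candidates, qualities, k, diversity_weight=1.0):
    """Greedily pick up to k candidate indices (hypothetical helper).

    Each pick maximizes quality plus a weighted distance to the nearest
    already-chosen candidate, so high-quality but near-duplicate
    candidates are penalized.
    """
    if k <= 0 or not candidates:
        return []
    remaining = list(range(len(candidates)))
    # Start from the single highest-quality candidate.
    first = max(remaining, key=lambda i: qualities[i])
    chosen = [first]
    remaining.remove(first)
    while remaining and len(chosen) < k:
        def score(i):
            # Diversity term: distance to the nearest chosen candidate.
            nearest = min(math.dist(candidates[i], candidates[j]) for j in chosen)
            return qualities[i] + diversity_weight * nearest
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy data (made up): four candidate options in a 2-D feature space.
candidates = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.0, 1.0)]
qualities = [1.0, 0.95, 0.6, 0.5]

quality_only = select_options(candidates, qualities, k=2, diversity_weight=0.0)
quality_and_diversity = select_options(candidates, qualities, k=2, diversity_weight=1.0)
print(quality_only, quality_and_diversity)
```

With `diversity_weight=0.0` the two near-identical top candidates are returned, which mirrors the "similar or the same solutions" problem the abstract mentions; a positive weight swaps the second pick for a more distant candidate.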

Similar articles

1. Quality-diversity based semi-autonomous teleoperation using reinforcement learning.
   Neural Netw. 2024 Nov;179:106543. doi: 10.1016/j.neunet.2024.106543. Epub 2024 Jul 22.
2. RL-DOVS: Reinforcement Learning for Autonomous Robot Navigation in Dynamic Environments.
   Sensors (Basel). 2022 May 19;22(10):3847. doi: 10.3390/s22103847.
3. A learning-based semi-autonomous controller for robotic exploration of unknown disaster scenes while searching for victims.
   IEEE Trans Cybern. 2014 Dec;44(12):2719-32. doi: 10.1109/TCYB.2014.2314294. Epub 2014 Apr 18.
4. ASAP-CORPS: A Semi-Autonomous Platform for COntact-Rich Precision Surgery.
   Mil Med. 2023 Nov 8;188(Suppl 6):412-419. doi: 10.1093/milmed/usad175.
5. Energy-efficient and damage-recovery slithering gait design for a snake-like robot based on reinforcement learning and inverse reinforcement learning.
   Neural Netw. 2020 Sep;129:323-333. doi: 10.1016/j.neunet.2020.05.029. Epub 2020 Jun 16.
6. Error-related potential-based shared autonomy via deep recurrent reinforcement learning.
   J Neural Eng. 2022 Dec 5;19(6). doi: 10.1088/1741-2552/aca4fb.
7. A Survey of Sim-to-Real Transfer Techniques Applied to Reinforcement Learning for Bioinspired Robots.
   IEEE Trans Neural Netw Learn Syst. 2023 Jul;34(7):3444-3459. doi: 10.1109/TNNLS.2021.3112718. Epub 2023 Jul 6.
8. Discovering diverse solutions in deep reinforcement learning by maximizing state-action-based mutual information.
   Neural Netw. 2022 Aug;152:90-104. doi: 10.1016/j.neunet.2022.04.009. Epub 2022 Apr 16.
9. GeneWorker: An end-to-end robotic reinforcement learning approach with collaborative generator and worker networks.
   Neural Netw. 2024 Oct;178:106472. doi: 10.1016/j.neunet.2024.106472. Epub 2024 Jun 18.
10. Visual Pretraining via Contrastive Predictive Model for Pixel-Based Reinforcement Learning.
   Sensors (Basel). 2022 Aug 29;22(17):6504. doi: 10.3390/s22176504.

Cited by

1. An Intuitive and Efficient Teleoperation Human-Robot Interface Based on a Wearable Myoelectric Armband.
   Biomimetics (Basel). 2025 Jul 15;10(7):464. doi: 10.3390/biomimetics10070464.