Hierarchical reinforcement learning with central pattern generator for enabling a quadruped robot simulator to walk on a variety of terrains.

Authors

Watanabe Toshiki, Kubo Akihiro, Tsunoda Kai, Matsuba Tatsuya, Akatsuka Shintaro, Noda Yukihiro, Kioka Hiroaki, Izawa Jin, Ishii Shin, Nakamura Yutaka

Affiliations

Kyoto University, Kyoto, Japan.

Advanced Telecommunications Research Institute, Kyoto, Japan.

Publication

Sci Rep. 2025 Apr 2;15(1):11262. doi: 10.1038/s41598-025-94163-2.

DOI: 10.1038/s41598-025-94163-2
PMID: 40175434
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11965509/
Abstract

We present a data-driven deep reinforcement learning (DRL) method for optimizing a hierarchically structured control policy that includes a central pattern generator. The method, referred to as hierarchical reinforcement learning with a central pattern generator (HRL-CPG), is evaluated with a view to its applicability to real robot control. We observed that stable gait motions were obtained in a reasonably small number of trials. We therefore conclude that HRL-CPG is a candidate DRL method for enabling dynamical systems, such as real or realistic robots, to adapt to a variety of environments within a moderate amount of physical time.
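The abstract's key ingredient, a central pattern generator (CPG) layer whose rhythm a higher-level policy can modulate, can be illustrated with a minimal sketch. This is not the paper's formulation: the Kuramoto-style coupled phase oscillators, the trot-gait offsets, and the parameters `freq`, `amp`, and `coupling` are all illustrative assumptions.

```python
import numpy as np

def cpg_step(phase, freq, amp, dt=0.01, coupling=1.0):
    """Advance four leg-phase oscillators one Euler step; return new phases
    and rhythmic joint targets. Diffusive (Kuramoto-style) coupling pulls
    the legs toward trot-gait phase offsets (diagonal legs in sync)."""
    offsets = np.array([0.0, np.pi, np.pi, 0.0])  # assumed trot pattern
    dphase = np.full(4, 2.0 * np.pi * freq)       # common drive frequency
    for i in range(4):
        for j in range(4):
            # Pull phase differences toward the gait's desired offsets.
            dphase[i] += coupling * np.sin(
                (phase[j] - offsets[j]) - (phase[i] - offsets[i])
            )
    phase = (phase + dphase * dt) % (2.0 * np.pi)
    return phase, amp * np.sin(phase)  # targets a low-level PD loop could track

# In a hierarchical RL setting, a high-level policy would output (freq, amp)
# every macro-step; here they are held fixed while the phases lock to the gait.
phase = np.array([0.0, 1.0, 2.0, 3.0])
for _ in range(2000):  # 20 s of simulated time
    phase, targets = cpg_step(phase, freq=1.5, amp=0.4)
```

After convergence the phase difference between the two diagonal leg pairs settles near π: each diagonal pair swings together while the pairs alternate, which is the defining feature of a trot.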


Figures (PMC full text):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/6778c35d3b8e/41598_2025_94163_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/0850292df8a9/41598_2025_94163_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/dd3556caf753/41598_2025_94163_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/917439258868/41598_2025_94163_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/dee9bfcd5810/41598_2025_94163_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/fcb050943644/41598_2025_94163_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/d0e0a2ebf8d0/41598_2025_94163_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/f281d1624068/41598_2025_94163_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/28b404c2372b/41598_2025_94163_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/13b87d524319/41598_2025_94163_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/6c5059d4a8d8/41598_2025_94163_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/ffe08c720f9c/41598_2025_94163_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/596fc795492f/41598_2025_94163_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/5200ff739414/41598_2025_94163_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b097/11965509/2892d63ca7e9/41598_2025_94163_Fig14_HTML.jpg

Similar articles

1
Hierarchical reinforcement learning with central pattern generator for enabling a quadruped robot simulator to walk on a variety of terrains.
Sci Rep. 2025 Apr 2;15(1):11262. doi: 10.1038/s41598-025-94163-2.
2
Deep Learning Vision System for Quadruped Robot Gait Pattern Regulation.
Biomimetics (Basel). 2023 Jul 3;8(3):289. doi: 10.3390/biomimetics8030289.
3
Multi-constraint spatial coupling for the body joint quadruped robot and the CPG control method on rough terrain.
Bioinspir Biomim. 2023 Sep 4;18(5). doi: 10.1088/1748-3190/acf357.
4
Hybrid learning mechanisms under a neural control network for various walking speed generation of a quadruped robot.
Neural Netw. 2023 Oct;167:292-308. doi: 10.1016/j.neunet.2023.08.030. Epub 2023 Aug 26.
5
CPG-inspired workspace trajectory generation and adaptive locomotion control for quadruped robots.
IEEE Trans Syst Man Cybern B Cybern. 2011 Jun;41(3):867-80. doi: 10.1109/TSMCB.2010.2097589. Epub 2011 Jan 6.
6
Viability leads to the emergence of gait transitions in learning agile quadrupedal locomotion on challenging terrains.
Nat Commun. 2024 Apr 9;15(1):3073. doi: 10.1038/s41467-024-47443-w.
7
A Hierarchical Framework for Quadruped Robots Gait Planning Based on DDPG.
Biomimetics (Basel). 2023 Aug 22;8(5):382. doi: 10.3390/biomimetics8050382.
8
Exploring Behaviors of Caterpillar-Like Soft Robots with a Central Pattern Generator-Based Controller and Reinforcement Learning.
Soft Robot. 2019 Oct;6(5):579-594. doi: 10.1089/soro.2018.0126. Epub 2019 May 20.
9
Learning and Reusing Quadruped Robot Movement Skills from Biological Dogs for Higher-Level Tasks.
Sensors (Basel). 2023 Dec 20;24(1):28. doi: 10.3390/s24010028.
10
ARS: AI-Driven Recovery Controller for Quadruped Robot Using Single-Network Model.
Biomimetics (Basel). 2024 Dec 10;9(12):749. doi: 10.3390/biomimetics9120749.

References cited in this article

1
Viability leads to the emergence of gait transitions in learning agile quadrupedal locomotion on challenging terrains.
Nat Commun. 2024 Apr 9;15(1):3073. doi: 10.1038/s41467-024-47443-w.
2
Deep reinforcement learning in medical imaging: A literature review.
Med Image Anal. 2021 Oct;73:102193. doi: 10.1016/j.media.2021.102193. Epub 2021 Jul 27.
3
Learning agile and dynamic motor skills for legged robots.
Sci Robot. 2019 Jan 16;4(26). doi: 10.1126/scirobotics.aau5872.
4
A Minimal Model Describing Hexapedal Interlimb Coordination: The Tegotae-Based Approach.
Front Neurorobot. 2017 Jun 9;11:29. doi: 10.3389/fnbot.2017.00029. eCollection 2017.
5
Human-level control through deep reinforcement learning.
Nature. 2015 Feb 26;518(7540):529-33. doi: 10.1038/nature14236.
6
Analysis of a neural oscillator.
Biol Cybern. 2011 May;104(4-5):297-304. doi: 10.1007/s00422-011-0432-z. Epub 2011 May 12.
7
Reinforcement learning for a biped robot based on a CPG-actor-critic method.
Neural Netw. 2007 Aug;20(6):723-35. doi: 10.1016/j.neunet.2007.01.002. Epub 2007 Feb 20.
8
From swimming to walking with a salamander robot driven by a spinal cord model.
Science. 2007 Mar 9;315(5817):1416-20. doi: 10.1126/science.1138353.
9
Control of locomotion in the decerebrate cat.
Prog Neurobiol. 1996 Aug;49(5):481-515. doi: 10.1016/0301-0082(96)00028-7.
10
Mechanisms of frequency and pattern control in the neural rhythm generators.
Biol Cybern. 1987;56(5-6):345-53. doi: 10.1007/BF00319514.