Combined control algorithm based on synchronous reinforcement learning for a self-balancing bicycle robot.

Authors

Guo Lei, Lin Hongyu, Jiang Jiale, Song Yuan, Gan Dongming

Affiliations

School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China.

School of Engineering Technology, Purdue University, West Lafayette, IN, USA.

Publication information

ISA Trans. 2024 Feb;145:479-492. doi: 10.1016/j.isatra.2023.11.032. Epub 2023 Nov 23.

DOI: 10.1016/j.isatra.2023.11.032
PMID: 38007371

Abstract

In this paper, balance control of a bicycle robot is studied without either a trail or a mechanical regulator when the robot moves in an approximately rectilinear motion. Based on the principle of moment balance, an input nonaffine nonlinear dynamics model of the bicycle robot is established. A driving velocity condition is proposed to maintain the robot balance. The nonaffine nonlinear system is transformed into an affine nonlinear system by defining the equivalent control. Subsequently, a feedback linearization controller is designed for the equivalent control. We design a combined control algorithm of synchronous policy iteration based on the actor-critic architecture. The actor neural network (NN) is designed based on the feedback linearization control law. Weight tuning laws for the critic and actor NNs are proposed. The system closed-loop stability and convergence of the NN weights are guaranteed based on the Lyapunov analysis. The optimality of the equivalent control policy is guaranteed. To satisfy the driving velocity condition, the values of the steering angle and driving velocity are determined based on the optimal equivalent control. The effectiveness of the proposed algorithm is verified through simulations and real experiments.
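The feedback linearization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's dynamics model, so everything below is an assumption made for illustration: a toy inverted-pendulum-style roll model `theta_ddot = drift(theta) + u`, the centre-of-mass height `H`, and the gains `k1` and `k2` are all hypothetical, and `u` merely plays the role of the "equivalent control". It is not the authors' controller, and it omits the actor-critic learning entirely.

```python
import math

G = 9.81   # gravitational acceleration, m/s^2
H = 0.5    # assumed centre-of-mass height, m (hypothetical value)


def drift(theta):
    """Gravity term of the assumed toy roll model: theta_ddot = drift(theta) + u."""
    return (G / H) * math.sin(theta)


def feedback_linearizing_control(theta, theta_dot, k1=25.0, k2=10.0):
    """Cancel the nonlinear drift, then apply a linear PD law to what remains."""
    v = -k1 * theta - k2 * theta_dot   # linear outer-loop control (hypothetical gains)
    return -drift(theta) + v           # stand-in for the equivalent control u


def simulate(theta0, steps=5000, dt=0.001):
    """Integrate the closed loop with explicit Euler and return the final state."""
    theta, theta_dot = theta0, 0.0
    for _ in range(steps):
        u = feedback_linearizing_control(theta, theta_dot)
        theta_ddot = drift(theta) + u
        theta += dt * theta_dot
        theta_dot += dt * theta_ddot
    return theta, theta_dot


if __name__ == "__main__":
    theta, theta_dot = simulate(theta0=0.2)
    print(f"final roll angle: {theta:.6f} rad, rate: {theta_dot:.6f} rad/s")
```

With the drift cancelled, the closed loop in this toy model reduces to the linear system `theta_ddot = -k1 * theta - k2 * theta_dot`, so the roll angle decays toward zero for any positive gains. In the paper, as the abstract describes it, a law of this kind serves as the basis for the actor network rather than as the final controller.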

