Abdominal-Waving Control of Tethered Bumblebees Based on Sarsa With Transformed Reward.

Publication information

IEEE Trans Cybern. 2019 Aug;49(8):3064-3073. doi: 10.1109/TCYB.2018.2838595. Epub 2018 Jun 22.

Abstract

Cyborg insects have attracted great attention because their flight performance is unmatched by micro aerial vehicles, and they play a critical role in supporting a wide range of applications. Constructing a cyborg insect involves two major issues: 1) the stimulating paradigm and 2) the control policy. At present, most cyborg insects are built with invasive methods that require implanting electrodes into neural or muscle systems, which harms the insects. The control policy, meanwhile, is mostly manual, which demands an excessive number of experiments and focused attention. This paper presents the design and implementation of a noninvasive and much safer cyborg insect system based on visual stimulation. The tethered paradigm is adopted, and control of the flight behavior of bumblebees, especially the abdominal-waving behavior, is treated as a model-free reinforcement learning problem. The problem is formulated as a finite, deterministic Markov decision process in which the agent is designed to drive the abdominal-waving behavior from the initial state to the target state. Sarsa with a transformed reward function, which can speed up the learning process, is employed to learn the optimal control policy. Learned policies are compared with a stochastic policy by evaluating the results of ten bumblebees, demonstrating that the abdominal-waving state can be modulated to approach the target state quickly and with small deviation.
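The abstract names the ingredients (a finite, deterministic Markov decision process, an initial and a target state, and Sarsa with a transformed reward) but gives no details. The following is a minimal sketch of tabular Sarsa under assumed details, not the authors' implementation: the seven-state chain, the three actions, the target index, the hyperparameters, and in particular the reward transform (a simple rescaling of the negative distance to the target) are all hypothetical choices made for illustration.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
N_STATES = 7           # hypothetical discretized abdominal-waving states 0..6
TARGET = 5             # hypothetical target state
ACTIONS = (-1, 0, 1)   # hypothetical stimulus effects: lower, hold, raise


def step(state, action):
    """Deterministic transition: shift along the chain, clipped to valid states."""
    return min(max(state + action, 0), N_STATES - 1)


def raw_reward(state):
    """Raw reward: negative distance to the target state."""
    return -abs(state - TARGET)


def transformed_reward(state):
    """Assumed transform: rescale the raw reward into [-1, 0].
    The paper's actual transform is not stated in the abstract."""
    return raw_reward(state) / (N_STATES - 1)


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda i: q[state][i])


def train_sarsa(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1,
                max_steps=50, seed=0):
    """Tabular Sarsa; each episode starts in state 0 and ends at TARGET."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(q, s, epsilon, rng)
        for _ in range(max_steps):
            s2 = step(s, ACTIONS[a])
            r = transformed_reward(s2)
            if s2 == TARGET:
                # Terminal transition: no bootstrap term.
                q[s][a] += alpha * (r - q[s][a])
                break
            # On-policy update: bootstrap from the action actually chosen next.
            a2 = epsilon_greedy(q, s2, epsilon, rng)
            q[s][a] += alpha * (r + gamma * q[s2][a2] - q[s][a])
            s, a = s2, a2
    return q


def greedy_rollout(q, start=0, max_steps=20):
    """Follow the greedy policy from start until TARGET or max_steps."""
    s, path = start, [start]
    for _ in range(max_steps):
        if s == TARGET:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        s = step(s, ACTIONS[a])
        path.append(s)
    return path


if __name__ == "__main__":
    q_table = train_sarsa()
    print(greedy_rollout(q_table))
```

In this toy setting every reward is at most zero and the table starts at zero, so untried actions look attractive at first and the agent explores before settling on a path toward the target. Swapping `transformed_reward` for `raw_reward` in `train_sarsa` is one way to compare learning speed under the two reward scales.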

Similar articles

A hybrid cancer prediction based on multi-omics data and reinforcement learning state action reward state action (SARSA).

Comput Biol Med. 2023 Mar;154:106617. doi: 10.1016/j.compbiomed.2023.106617. Epub 2023 Feb 3.

A Q-Learning Approach to Flocking With UAVs in a Stochastic Environment.

IEEE Trans Cybern. 2017 Jan;47(1):186-197. doi: 10.1109/TCYB.2015.2509646. Epub 2016 Jan 5.

Learning by observation emerges from simple associations in an insect model.

Curr Biol. 2013 Apr 22;23(8):727-30. doi: 10.1016/j.cub.2013.03.035. Epub 2013 Apr 4.

Colour learning when foraging for nectar and pollen: bees learn two colours at once.

Biol Lett. 2015 Sep;11(9):20150628. doi: 10.1098/rsbl.2015.0628.

Evidence for socially influenced and potentially actively coordinated cooperation by bumblebees.

Proc Biol Sci. 2024 May;291(2022):20240055. doi: 10.1098/rspb.2024.0055. Epub 2024 May 1.

RICA: a reliable and image configurable arena for cyborg bumblebee based on CAN bus.

Annu Int Conf IEEE Eng Med Biol Soc. 2014;2014:860-3. doi: 10.1109/EMBC.2014.6943727.

Bumblebees at work in an emotion-like state.

Learn Behav. 2017 Sep;45(3):207-208. doi: 10.3758/s13420-017-0265-2.

Bumblebees measure optic flow for position and speed control flexibly within the frontal visual field.

J Exp Biol. 2015 Apr;218(Pt 7):1051-9. doi: 10.1242/jeb.107409. Epub 2015 Feb 5.

Cyborg Moth Flight Control Based on Fuzzy Deep Learning.

Micromachines (Basel). 2022 Apr 13;13(4):611. doi: 10.3390/mi13040611.

Cited by

Cyborg Moth Flight Control Based on Fuzzy Deep Learning.

Micromachines (Basel). 2022 Apr 13;13(4):611. doi: 10.3390/mi13040611.

Adaptive Sliding Mode Disturbance Observer and Deep Reinforcement Learning Based Motion Control for Micropositioners.

Micromachines (Basel). 2022 Mar 17;13(3):458. doi: 10.3390/mi13030458.

Multisensory-motor integration in olfactory navigation of silkmoth, Bombyx mori, using virtual reality system.

Elife. 2021 Nov 25;10:e72001. doi: 10.7554/eLife.72001.

Behavioral control and changes in brain activity of honeybee during flapping.

Brain Behav. 2021 Dec;11(12):e2426. doi: 10.1002/brb3.2426. Epub 2021 Nov 22.
