Abdominal-Waving Control of Tethered Bumblebees Based on Sarsa With Transformed Reward.

Publication information

IEEE Trans Cybern. 2019 Aug;49(8):3064-3073. doi: 10.1109/TCYB.2018.2838595. Epub 2018 Jun 22.

Abstract

Cyborg insects have attracted great attention because their flight performance is unmatched by micro aerial vehicles, and they play a critical role in supporting a wide range of applications. Constructing cyborg insects involves two major issues: 1) the stimulating paradigm and 2) the control policy. At present, most cyborg insects are built with invasive methods that require implanting electrodes into neural or muscle systems, which harms the insects. The control policy is mostly manual, which has the drawback of requiring an excessive number of experiments and focused attention. This paper presents the design and implementation of a noninvasive and much safer cyborg insect system based on visual stimulation. The tethered paradigm is adopted, and control of bumblebee flight behavior, especially the abdominal-waving behavior, is treated as a model-free reinforcement learning problem. The problem is formulated as a finite and deterministic Markov decision process in which the agent is designed to change the abdominal-waving behavior from the initial state to the target state. Sarsa with a transformed reward function, which can speed up the learning process, is employed to learn the optimal control policy. Learned policies are compared with a stochastic policy by evaluating the results of ten bumblebees, demonstrating that the abdominal-waving state can be modulated to approximate the target state quickly and with small deviation.
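To make the formulation concrete, the following is a minimal sketch of tabular Sarsa on a toy finite, deterministic Markov decision process. It is not the paper's implementation: the seven-state chain, the three actions, the target state, the hyperparameters, and in particular the `transformed_reward` function are all hypothetical placeholders, since the abstract does not specify the state discretization, the visual stimuli, or the form of the reward transformation.

```python
import random

N_STATES = 7          # hypothetical discretized abdominal-waving states 0..6
ACTIONS = (-1, 0, 1)  # hypothetical stimuli: shift state down, hold, shift up
TARGET = 5            # hypothetical target state


def step(state, action):
    """Deterministic transition: move along the chain, clipped to valid states."""
    return min(max(state + action, 0), N_STATES - 1)


def raw_reward(state):
    """Raw reward: negative distance to the target state."""
    return -abs(state - TARGET)


def transformed_reward(state):
    """Placeholder transform (an assumption, not the paper's function):
    the raw distance penalty plus a bonus when the target is reached."""
    return raw_reward(state) + (10.0 if state == TARGET else 0.0)


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    row = q[state]
    return row.index(max(row))


def train_sarsa(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1,
                max_steps=30, seed=0):
    """Learn a Q-table with on-policy Sarsa over fixed-length episodes."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)  # random initial state
        action = epsilon_greedy(q, state, epsilon, rng)
        for _ in range(max_steps):
            next_state = step(state, ACTIONS[action])
            reward = transformed_reward(next_state)
            next_action = epsilon_greedy(q, next_state, epsilon, rng)
            # Sarsa bootstraps from the action actually selected next,
            # not from the maximum over actions (that would be Q-learning).
            td_target = reward + gamma * q[next_state][next_action]
            q[state][action] += alpha * (td_target - q[state][action])
            state, action = next_state, next_action
    return q


def greedy_rollout(q, start, steps=10):
    """Follow the greedy policy from `start` and return the visited states."""
    state = start
    path = [state]
    for _ in range(steps):
        row = q[state]
        state = step(state, ACTIONS[row.index(max(row))])
        path.append(state)
    return path


if __name__ == "__main__":
    q_table = train_sarsa()
    print(greedy_rollout(q_table, start=0))
```

Swapping `transformed_reward` for `raw_reward` inside `train_sarsa` gives a baseline for comparing how quickly the greedy policy settles on the target under each reward definition.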
