UAV Autonomous Tracking and Landing Based on Deep Reinforcement Learning Strategy

Authors

Xie Jingyi, Peng Xiaodong, Wang Haijiao, Niu Wenlong, Zheng Xiao

Affiliations

Key Laboratory of Electronics and Information Technology for Space System, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, China.

University of Chinese Academy of Sciences, Beijing 100049, China.

Publication information

Sensors (Basel). 2020 Oct 1;20(19):5630. doi: 10.3390/s20195630.

DOI:10.3390/s20195630
PMID:33019747
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7582896/
Abstract

Unmanned aerial vehicle (UAV) autonomous tracking and landing is playing an increasingly important role in military and civil applications. In particular, machine learning has been successfully introduced to robotics-related tasks. A novel UAV autonomous tracking and landing approach based on a deep reinforcement learning strategy is presented in this paper, with the aim of dealing with the UAV motion control problem in an unpredictable and harsh environment. Instead of building a prior model and inferring the landing actions based on heuristic rules, a model-free method based on a partially observable Markov decision process (POMDP) is proposed. In the POMDP model, the UAV automatically learns the landing maneuver by an end-to-end neural network, which combines the Deep Deterministic Policy Gradients (DDPG) algorithm and heuristic rules. A Modular Open Robots Simulation Engine (MORSE)-based reinforcement learning framework is designed and validated with a continuous UAV tracking and landing task on a randomly moving platform in high sensor noise and intermittent measurements. The simulation results show that when the moving platform is moving in different trajectories, the average landing success rate of the proposed algorithm is about 10% higher than that of the Proportional-Integral-Derivative (PID) method. As an indirect result, a state-of-the-art deep reinforcement learning-based UAV control method is validated, where the UAV can learn the optimal strategy of a continuously autonomous landing and perform properly in a simulation environment.
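
The abstract names DDPG as the learning algorithm but gives no implementation detail. As a rough illustration only, the sketch below shows three ingredients of the standard DDPG algorithm: the bootstrapped critic target, the slow-moving (soft) target-network update, and a deterministic action perturbed by exploration noise. The function names, the flat-list parameter representation, and the default values of `gamma` and `tau` are assumptions made for this example. They are not taken from the paper, and the paper's heuristic rules and MORSE setup are not modeled here.

```python
from typing import List


def td_target(reward: float, next_q: float, done: bool, gamma: float = 0.99) -> float:
    """Critic regression target for one transition: y = r + gamma * Q'(s', mu'(s')).

    `next_q` stands for the target critic evaluated at the next state and the
    target actor's action there. Terminal transitions do not bootstrap.
    `gamma` is an illustrative default, not a value reported in the paper.
    """
    return reward if done else reward + gamma * next_q


def soft_update(target: List[float], online: List[float], tau: float = 0.001) -> List[float]:
    """Polyak averaging of parameters: theta' <- tau * theta + (1 - tau) * theta'.

    Parameters are plain lists of floats here (a hypothetical simplification of
    real network weights). `tau` is an illustrative default.
    """
    if len(target) != len(online):
        raise ValueError("target and online parameter lists must have equal length")
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def noisy_action(mu: float, noise: float, low: float, high: float) -> float:
    """Deterministic policy output plus exploration noise, clipped to the action bounds."""
    return max(low, min(high, mu + noise))
```

For example, `soft_update([0.0, 2.0], [1.0, 4.0], tau=0.5)` moves each target parameter halfway toward its online counterpart. In practice `tau` is kept small so that the targets used by `td_target` change slowly, which is what stabilizes the critic's regression.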

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/17ff7a5b5513/sensors-20-05630-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/617f8404d4bd/sensors-20-05630-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/276bd028404d/sensors-20-05630-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/7ded88293889/sensors-20-05630-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/1ba4f50bedd1/sensors-20-05630-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/0b0b2db845d1/sensors-20-05630-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/97b7aa547b69/sensors-20-05630-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/d37fc49b3fe0/sensors-20-05630-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/06df37ac908e/sensors-20-05630-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ec55/7582896/77f565ca0b03/sensors-20-05630-g010.jpg
