
Reinforcement Learning-Based Multi-AUV Adaptive Trajectory Planning for Under-Ice Field Estimation

Affiliations

Department of Electrical and Computer Engineering, Michigan Technological University, Houghton, MI 49931, USA.

Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA.

Publication information

Sensors (Basel). 2018 Nov 9;18(11):3859. doi: 10.3390/s18113859.

DOI: 10.3390/s18113859
PMID: 30424017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6263807/
Abstract

This work studies online learning-based trajectory planning for multiple autonomous underwater vehicles (AUVs) to estimate a water parameter field of interest in the under-ice environment. A centralized system is considered, where several fixed access points on the ice layer are introduced as gateways for communications between the AUVs and a remote data fusion center. We model the water parameter field of interest as a Gaussian process with unknown hyper-parameters. The AUV trajectories for sampling are determined on an epoch-by-epoch basis. At the end of each epoch, the access points relay the observed field samples from all the AUVs to the fusion center, which computes the posterior distribution of the field based on the Gaussian process regression and estimates the field hyper-parameters. The optimal trajectories of all the AUVs in the next epoch are determined to maximize a long-term reward that is defined based on the field uncertainty reduction and the AUV mobility cost, subject to the kinematics constraint, the communication constraint and the sensing area constraint. We formulate the adaptive trajectory planning problem as a Markov decision process (MDP). A reinforcement learning-based online learning algorithm is designed to determine the optimal AUV trajectories in a constrained continuous space. Simulation results show that the proposed learning-based trajectory planning algorithm has performance similar to a benchmark method that assumes perfect knowledge of the field hyper-parameters.
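The per-epoch update described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the squared-exponential kernel, the fixed hyper-parameter values, the noise level, the `cost_weight` trade-off, and all function names are assumptions chosen for illustration, and the reinforcement learning step and hyper-parameter estimation are omitted. The sketch only shows how a fusion center might compute a Gaussian process posterior from the collected samples and score an epoch by uncertainty reduction minus mobility cost.

```python
import numpy as np

def sq_exp_kernel(a, b, signal_var=1.0, length_scale=1.0):
    """Squared-exponential covariance between point sets a (n, d) and b (m, d)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at the query points."""
    k_tt = sq_exp_kernel(x_train, x_train, **kernel_args)
    k_tt = k_tt + noise_var * np.eye(len(x_train))
    k_qt = sq_exp_kernel(x_query, x_train, **kernel_args)
    chol = np.linalg.cholesky(k_tt)  # k_tt = chol @ chol.T
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_qt @ alpha
    v = np.linalg.solve(chol, k_qt.T)
    prior_var = kernel_args.get("signal_var", 1.0)
    var = prior_var - np.sum(v ** 2, axis=0)
    return mean, var

def epoch_reward(var_before, var_after, path_length, cost_weight=0.1):
    """Hypothetical reward: total variance reduction minus weighted travel distance."""
    reduction = np.sum(var_before) - np.sum(var_after)
    return float(reduction - cost_weight * path_length)

def toy_field(points):
    """Stand-in for the unknown water parameter field (assumption)."""
    return np.sin(points[:, 0]) + np.cos(points[:, 1])

# Query grid over a 5 x 5 area and two epochs of sample locations for one AUV.
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 5.0, 6)
grid = np.stack(np.meshgrid(axis, axis), axis=-1).reshape(-1, 2)
epoch1 = rng.uniform(0.0, 5.0, size=(8, 2))
epoch2 = rng.uniform(0.0, 5.0, size=(8, 2))

# Posterior after epoch 1, then after fusing the samples of epoch 2.
_, var_before = gp_posterior(epoch1, toy_field(epoch1), grid)
all_points = np.vstack([epoch1, epoch2])
mean_after, var_after = gp_posterior(all_points, toy_field(all_points), grid)

# Mobility cost approximated by the length of the path through epoch 2 waypoints.
path_length = np.sum(np.linalg.norm(np.diff(epoch2, axis=0), axis=1))
print("epoch reward:", epoch_reward(var_before, var_after, path_length))
```

With fixed hyper-parameters, adding samples never increases the posterior variance, so the uncertainty-reduction term is non-negative and the reward turns negative only when the travel cost outweighs it. In the setting the abstract describes, the hyper-parameters are unknown and re-estimated each epoch, so this monotonicity is a property of the simplified sketch only.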


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df10/6263807/fea19d7236bc/sensors-18-03859-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df10/6263807/65440d3cb5af/sensors-18-03859-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df10/6263807/d59414459cf6/sensors-18-03859-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df10/6263807/0d18f7c22cda/sensors-18-03859-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df10/6263807/97b7d2ff7bca/sensors-18-03859-g005.jpg
