Viewpoint planning with transition management for active object recognition.

Authors

Sun Haibo, Zhu Feng, Li Yangyang, Zhao Pengfei, Kong Yanzi, Wang Jianyu, Wan Yingcai, Fu Shuangfei

Affiliations

Faculty of Robot Science and Engineering, Northeastern University, Shenyang, China.

Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China.

Publication information

Front Neurorobot. 2023 Feb 24;17:1093132. doi: 10.3389/fnbot.2023.1093132. eCollection 2023.

DOI:10.3389/fnbot.2023.1093132

PMID:36910268

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9998679/

Abstract

Active object recognition (AOR) provides a paradigm in which an agent can capture additional evidence by purposefully changing its viewpoint to improve the quality of recognition. A central problem in AOR is viewpoint planning (VP), which refers to developing a policy that determines the agent's next viewpoints. A current research trend is to solve the VP problem with reinforcement learning, that is, to train the VP policy on the viewpoint transitions explored by the agent. However, most existing work discards transitions once they have been used for training, which may lead to inefficient use of the explored transitions. To address this problem, we present a novel VP method with transition management based on reinforcement learning, which can reuse the explored viewpoint transitions. Specifically, a learning framework for the VP policy is first established based on deterministic policy gradient theory, which makes it possible to reuse the explored transitions. Then, we design a viewpoint transition management scheme that stores the explored transitions and decides which transitions are used for policy learning. Finally, within this framework, we develop an algorithm based on twin delayed deep deterministic policy gradient and the designed scheme to train the VP policy. Experiments on the public and challenging GERMS dataset show the effectiveness of our method in comparison with several competing approaches.
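The idea of storing explored transitions and reusing them for off-policy learning can be illustrated with a minimal sketch. The abstract does not specify how the paper's management scheme selects transitions, so the uniform sampling below is only a stand-in, and the names `Transition`, `TransitionStore`, and `td3_target` are hypothetical. The target function shows the standard clipped double-Q target of twin delayed deep deterministic policy gradient (TD3), with the two next-state critic values passed in as plain numbers rather than computed by networks.

```python
import random
from collections import deque
from typing import NamedTuple, Sequence


class Transition(NamedTuple):
    """One explored viewpoint transition (field names are illustrative)."""
    state: Sequence[float]       # e.g. features of the current view
    action: float                # continuous viewpoint change (assumed 1-D)
    reward: float
    next_state: Sequence[float]
    done: bool


class TransitionStore:
    """Fixed-capacity store; the oldest transitions are evicted first."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def __len__(self) -> int:
        return len(self._buffer)

    def add(self, transition: Transition) -> None:
        self._buffer.append(transition)

    def sample(self, batch_size: int) -> list:
        # Uniform sampling without replacement. This is an assumption:
        # the abstract does not state the paper's selection rule.
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)


def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: r + gamma * (1 - done) * min(Q1', Q2').

    q1_next and q2_next stand for the two target critics' values at the
    next state and the (noise-smoothed) target-policy action.
    """
    return reward + gamma * (1.0 - float(done)) * min(q1_next, q2_next)


# Store five transitions in a buffer that only holds three.
store = TransitionStore(capacity=3)
for step in range(5):
    store.add(Transition([float(step)], 0.1, 1.0, [float(step + 1)], False))

batch = store.sample(2)
```

Because a deterministic policy gradient method is off-policy, a stored transition remains usable for later updates even after the policy that generated it has changed, which is what makes a store of this kind worthwhile.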

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a10/9998679/a6c7cc5649d7/fnbot-17-1093132-g0001.jpg

Similar articles

Viewpoint planning with transition management for active object recognition.

Front Neurorobot. 2023 Feb 24;17:1093132. doi: 10.3389/fnbot.2023.1093132. eCollection 2023.

Continuous Viewpoint Planning in Conjunction with Dynamic Exploration for Active Object Recognition.

Entropy (Basel). 2021 Dec 20;23(12):1702. doi: 10.3390/e23121702.

Model-Based Predictive Control and Reinforcement Learning for Planning Vehicle-Parking Trajectories for Vertical Parking Spaces.

Sensors (Basel). 2023 Aug 11;23(16):7124. doi: 10.3390/s23167124.

A Multitasking-Oriented Robot Arm Motion Planning Scheme Based on Deep Reinforcement Learning and Twin Synchro-Control.

Sensors (Basel). 2020 Jun 21;20(12):3515. doi: 10.3390/s20123515.

Robot grasping method optimization using improved deep deterministic policy gradient algorithm of deep reinforcement learning.

Rev Sci Instrum. 2021 Feb 1;92(2):025114. doi: 10.1063/5.0034101.

Prioritized experience replay in path planning via multi-dimensional transition priority fusion.

Front Neurorobot. 2023 Nov 15;17:1281166. doi: 10.3389/fnbot.2023.1281166. eCollection 2023.

Diversity Evolutionary Policy Deep Reinforcement Learning.

Comput Intell Neurosci. 2021 Aug 3;2021:5300189. doi: 10.1155/2021/5300189. eCollection 2021.

Approximate Policy-Based Accelerated Deep Reinforcement Learning.

IEEE Trans Neural Netw Learn Syst. 2020 Jun;31(6):1820-1830. doi: 10.1109/TNNLS.2019.2927227. Epub 2019 Aug 6.

Intelligent Land-Vehicle Model Transfer Trajectory Planning Method Based on Deep Reinforcement Learning.

Sensors (Basel). 2018 Sep 1;18(9):2905. doi: 10.3390/s18092905.

PaCAR: COVID-19 Pandemic Control Decision Making via Large-Scale Agent-Based Modeling and Deep Reinforcement Learning.

Med Decis Making. 2022 Nov;42(8):1064-1077. doi: 10.1177/0272989X221107902. Epub 2022 Jul 1.

Cited by

Adaptive visual-tactile fusion recognition for robotic operation of multi-material system.

Front Neurorobot. 2023 Jun 20;17:1181383. doi: 10.3389/fnbot.2023.1181383. eCollection 2023.
