Partial Policy-Based Reinforcement Learning for Anatomical Landmark Localization in 3D Medical Images.

Publication information

IEEE Trans Med Imaging. 2020 Apr;39(4):1245-1255. doi: 10.1109/TMI.2019.2946345. Epub 2019 Oct 9.

Abstract

Built on the idea of long-term cumulative return, reinforcement learning (RL) has shown remarkable performance in various fields. We follow the formulation of landmark localization in 3D medical images as an RL problem. Whereas value-based methods have been widely used to solve RL-based localization problems, we adopt an actor-critic based direct policy search method framed in a temporal difference learning approach. In RL problems with large state and/or action spaces, learning the optimal behavior is challenging and requires many trials. To improve learning, we introduce partial policy-based reinforcement learning, which solves the large localization problem by learning the optimal policy on smaller partial domains. Independent actors efficiently learn the corresponding partial policies, each using its own independent critic. The proposed policy reconstruction from the partial policies ensures robust and efficient localization: the sub-agents contribute uniformly to the state transitions, each through a simple partial policy that maps to binary actions. Experiments on three different localization problems in 3D CT and MR images showed that the proposed method requires significantly fewer trials to learn the optimal behavior than the original behavior learning scheme in RL. It also achieves satisfactory performance when trained on fewer images.
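
The policy reconstruction described above can be illustrated with a minimal Python sketch. The abstract does not say how the partial domains are defined, so this sketch assumes one sub-agent per spatial axis, each emitting a binary action (step in the positive or negative direction). The names `reconstruct_step`, `localize`, and `toy_policy` are hypothetical, and the hand-written toy policy stands in for the learned actors; it is not the authors' implementation.

```python
import numpy as np

def reconstruct_step(partial_probs, step_size=1):
    """Combine binary partial policies into one 3D displacement.

    ASSUMPTION: partial_probs[i] is the probability that the sub-agent
    for axis i chooses the positive direction along that axis.
    """
    probs = np.asarray(partial_probs, dtype=float)
    # Greedy binary action per sub-agent: +1 if p >= 0.5, otherwise -1.
    directions = np.where(probs >= 0.5, 1, -1)
    # Every sub-agent contributes equally: one step along its own axis.
    return directions * step_size

def localize(start, partial_policy, max_steps=100, step_size=1):
    """Move an agent through the volume using the reconstructed policy."""
    position = np.asarray(start, dtype=int)
    for _ in range(max_steps):
        position = position + reconstruct_step(partial_policy(position), step_size)
    return position

# Toy stand-in for the learned actors (hypothetical): each axis points
# toward a known target, which a trained actor would infer from the image.
target = np.array([5, -3, 2])

def toy_policy(position):
    return (target > position).astype(float)

final = localize([0, 0, 0], toy_policy, max_steps=50)
```

Because each sub-agent only ever chooses between two moves, the agent in this sketch never stands still; it ends up oscillating within one step of the target on each axis. A stopping criterion would be needed in practice, and the abstract does not describe one.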
