Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network.

Publication Info

IEEE Trans Neural Netw Learn Syst. 2020 Oct;31(10):4374-4380. doi: 10.1109/TNNLS.2019.2948892. Epub 2019 Nov 22.

DOI: 10.1109/TNNLS.2019.2948892
PMID: 31765320
Abstract

The deep Q-network (DQN) and return-based reinforcement learning are two promising algorithms proposed in recent years. The DQN brings advances to complex sequential decision problems, while return-based algorithms have advantages in making use of sample trajectories. In this brief, we propose a general framework to combine the DQN and most of the return-based reinforcement learning algorithms, named R-DQN. We show that the performance of the traditional DQN can be significantly improved by introducing return-based algorithms. In order to further improve the R-DQN, we design a strategy with two measurements to qualitatively measure the policy discrepancy. We conduct experiments on several representative tasks from the OpenAI Gym and Atari games. The state-of-the-art performance achieved by our method with this proposed strategy validates its effectiveness.
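The two ingredients the abstract describes — return-based (e.g., λ-return) targets in place of the DQN's one-step bootstrap, and a measure of how far the behavior policy that generated a trajectory has drifted from the current greedy policy — can be sketched roughly as below. This is an illustrative NumPy sketch under standard definitions, not the authors' R-DQN implementation; the function names and the simple disagreement-rate measure are assumptions for illustration.

```python
import numpy as np

def lambda_return_targets(rewards, q_next_max, dones, gamma=0.99, lam=0.8):
    """Lambda-return targets for one sampled trajectory.

    rewards[t], dones[t]: reward and terminal flag at step t (length T).
    q_next_max[t]: max_a Q(s_{t+1}, a) from the target network (length T).
    Recursion: G_t = r_t + gamma * ((1 - lam) * max_a Q(s_{t+1}, a) + lam * G_{t+1}),
    falling back to the one-step target at trajectory ends and terminal steps.
    """
    T = len(rewards)
    targets = np.zeros(T)
    g = 0.0
    for t in reversed(range(T)):
        one_step = rewards[t] + gamma * (1.0 - dones[t]) * q_next_max[t]
        if t == T - 1 or dones[t]:
            g = one_step                      # no future return to blend in
        else:
            g = rewards[t] + gamma * ((1.0 - lam) * q_next_max[t] + lam * g)
        targets[t] = g
    return targets

def policy_discrepancy(behavior_actions, greedy_actions):
    """A simple qualitative discrepancy measure: the fraction of steps where
    the stored behavior action disagrees with the current greedy action.
    Larger values mean the trajectory is more off-policy for the learner."""
    behavior_actions = np.asarray(behavior_actions)
    greedy_actions = np.asarray(greedy_actions)
    return float(np.mean(behavior_actions != greedy_actions))
```

A learner could, for instance, shorten the eligibility of a stored trajectory (or fall back to one-step targets) once `policy_discrepancy` crosses a threshold — the general motivation behind truncating return-based updates when trajectories become too off-policy.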


Similar Articles

1. Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network. IEEE Trans Neural Netw Learn Syst. 2020 Oct;31(10):4374-4380. doi: 10.1109/TNNLS.2019.2948892. Epub 2019 Nov 22.
2. Minibatch Recursive Least Squares Q-Learning. Comput Intell Neurosci. 2021 Oct 8;2021:5370281. doi: 10.1155/2021/5370281. eCollection 2021.
3. Approximate Policy-Based Accelerated Deep Reinforcement Learning. IEEE Trans Neural Netw Learn Syst. 2020 Jun;31(6):1820-1830. doi: 10.1109/TNNLS.2019.2927227. Epub 2019 Aug 6.
4. Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning. Front Neurorobot. 2019 Dec 10;13:103. doi: 10.3389/fnbot.2019.00103. eCollection 2019.
5. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning. Neural Netw. 2018 Nov;107:3-11. doi: 10.1016/j.neunet.2017.12.012. Epub 2018 Jan 11.
6. MonkeyKing: Adaptive Parameter Tuning on Big Data Platforms with Deep Reinforcement Learning. Big Data. 2020 Aug;8(4):270-290. doi: 10.1089/big.2019.0123. Epub 2020 Jul 10.
7. Deep reinforcement learning for automated radiation adaptation in lung cancer. Med Phys. 2017 Dec;44(12):6690-6705. doi: 10.1002/mp.12625. Epub 2017 Nov 14.
8. Multisource Transfer Double DQN Based on Actor Learning. IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2227-2238. doi: 10.1109/TNNLS.2018.2806087.
9. Teleconsultation dynamic scheduling with a deep reinforcement learning approach. Artif Intell Med. 2024 Mar;149:102806. doi: 10.1016/j.artmed.2024.102806. Epub 2024 Feb 9.
10. Self-Paced Prioritized Curriculum Learning With Coverage Penalty in Deep Reinforcement Learning. IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2216-2226. doi: 10.1109/TNNLS.2018.2790981.