Adaptive critic designs.

Authors

Prokhorov D V, Wunsch D C

Affiliation

Dept. of Electr. Eng., Texas Tech. Univ., Lubbock, TX.

Publication information

IEEE Trans Neural Netw. 1997;8(5):997-1007. doi: 10.1109/72.623201.

DOI:10.1109/72.623201

PMID:18255702

Abstract

We discuss a variety of adaptive critic designs (ACDs) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Our discussion of these origins leads to an explanation of three design families: heuristic dynamic programming, dual heuristic programming, and globalized dual heuristic programming (GDHP). The main emphasis is on DHP and GDHP as advanced ACDs. We suggest two new modifications of the original GDHP design that are currently the only working implementations of GDHP. They promise to be useful for many engineering applications in the areas of optimization and optimal control. Based on one of these modifications, we present a unified approach to all ACDs. This leads to a generalized training procedure for ACDs.
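As a rough illustration of the simplest family named in the abstract, heuristic dynamic programming (HDP), the sketch below trains a critic to approximate the discounted cost-to-go J(t) = sum over k of gamma^k U(t+k) by repeatedly moving J(t) toward the target U(t) + gamma J(t+1). Everything concrete here is an assumption made for illustration and does not come from the paper: the toy scalar plant x(t+1) = a x(t), the quadratic utility U = x^2, the one-parameter critic w x^2 standing in for a neural network, the absence of an action network, and all function and parameter names.

```python
import random


def train_hdp_critic(a=0.5, gamma=0.9, lr=0.5, steps=2000, seed=0):
    """Fit w so that w * x**2 approximates J(x) = sum_k gamma**k * U(x_k).

    Hypothetical toy setup (not from the paper): plant x_next = a * x,
    utility U(x) = x**2, critic J_hat(x) = w * x**2.
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)        # sample a state
        x_next = a * x                    # one step of the toy plant
        utility = x * x                   # U(t)
        # HDP target: U(t) + gamma * J_hat(t+1), held fixed during the update
        target = utility + gamma * w * x_next * x_next
        error = target - w * x * x        # target minus J_hat(t)
        w += lr * error * x * x           # gradient step on 0.5 * error**2
    return w


if __name__ == "__main__":
    w_learned = train_hdp_critic()
    # Closed-form solution for this toy problem: 1 / (1 - gamma * a**2)
    w_exact = 1.0 / (1.0 - 0.9 * 0.5 ** 2)
    print(w_learned, w_exact)
```

For this toy problem the learned weight can be compared against the closed-form value 1 / (1 - gamma a^2). In the same spirit, a DHP critic would instead estimate the derivative of J with respect to the state, and GDHP would combine both kinds of training signal, as the abstract outlines.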


Similar articles

Adaptive critic designs.

IEEE Trans Neural Netw. 1997;8(5):997-1007. doi: 10.1109/72.623201.

Gr-GDHP: A New Architecture for Globalized Dual Heuristic Dynamic Programming.

IEEE Trans Cybern. 2017 Oct;47(10):3318-3330. doi: 10.1109/TCYB.2016.2598282. Epub 2016 Sep 19.

Comparison of heuristic dynamic programming and dual heuristic programming adaptive critics for neurocontrol of a turbogenerator.

IEEE Trans Neural Netw. 2002;13(3):764-73. doi: 10.1109/TNN.2002.1000146.

Efficient Online Globalized Dual Heuristic Programming With an Associated Dual Network.

IEEE Trans Neural Netw Learn Syst. 2023 Dec;34(12):10079-10090. doi: 10.1109/TNNLS.2022.3164727. Epub 2023 Nov 30.

Online learning control using adaptive critic designs with sparse kernel machines.

IEEE Trans Neural Netw Learn Syst. 2013 May;24(5):762-75. doi: 10.1109/TNNLS.2012.2236354.

Simple and fast calculation of the second-order gradients for globalized dual heuristic dynamic programming in neural networks.

IEEE Trans Neural Netw Learn Syst. 2012 Oct;23(10):1671-6. doi: 10.1109/TNNLS.2012.2205268.

Dynamic re-optimization of a fed-batch fermentor using adaptive critic designs.

IEEE Trans Neural Netw. 2001;12(6):1433-44. doi: 10.1109/72.963778.

An equivalence between adaptive dynamic programming with a critic and backpropagation through time.

IEEE Trans Neural Netw Learn Syst. 2013 Dec;24(12):2088-100. doi: 10.1109/TNNLS.2013.2271778.

Adaptive learning in tracking control based on the dual critic network design.

IEEE Trans Neural Netw Learn Syst. 2013 Jun;24(6):913-28. doi: 10.1109/TNNLS.2013.2247627.

Model-Free Adaptive Control for Unknown Nonlinear Zero-Sum Differential Game.

IEEE Trans Cybern. 2018 May;48(5):1633-1646. doi: 10.1109/TCYB.2017.2712617. Epub 2017 Jul 17.

Cited by

Artificial Development by Reinforcement Learning Can Benefit From Multiple Motivations.

Front Robot AI. 2019 Feb 14;6:6. doi: 10.3389/frobt.2019.00006. eCollection 2019.

Future of seizure prediction and intervention: closing the loop.

J Clin Neurophysiol. 2015 Jun;32(3):194-206. doi: 10.1097/WNP.0000000000000139.
