Adaptive critic designs.

Author information

Prokhorov D V, Wunsch D C

Affiliation

Department of Electrical Engineering, Texas Tech University, Lubbock, TX.

Publication information

IEEE Trans Neural Netw. 1997;8(5):997-1007. doi: 10.1109/72.623201.

Abstract

We discuss a variety of adaptive critic designs (ACDs) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Our discussion of these origins leads to an explanation of three design families: heuristic dynamic programming (HDP), dual heuristic programming (DHP), and globalized dual heuristic programming (GDHP). The main emphasis is on DHP and GDHP as advanced ACDs. We suggest two new modifications of the original GDHP design that are currently the only working implementations of GDHP. These modifications promise to be useful for many engineering applications in the areas of optimization and optimal control. Based on one of them, we present a unified approach to all ACDs. This leads to a generalized training procedure for ACDs.
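As a rough illustration of the simplest of these families, heuristic dynamic programming (HDP), the sketch below trains a critic to estimate the cost-to-go J by matching it to the Bellman target U(t) + gamma * J(t+1). It is not the authors' procedure: the scalar plant x(t+1) = a * x(t), the utility U = x^2, the one-weight critic J(x) = w * x^2 (standing in for a neural network), and the name `train_hdp_critic` with all of its parameters are assumptions made only for this example.

```python
import random


def train_hdp_critic(a=0.5, gamma=0.9, lr=0.1, steps=5000, seed=0):
    """Fit a one-weight critic J(x) = w * x**2 with an HDP-style update.

    Hypothetical toy setup (not from the paper): plant x_next = a * x,
    utility U(x) = x**2, discount factor gamma.
    """
    rng = random.Random(seed)
    w = 0.0  # critic weight
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)      # sample a state
        x_next = a * x                  # plant under a fixed policy
        utility = x * x                 # one-step cost U(t)
        # Bellman target built from the critic's own next-state estimate
        target = utility + gamma * w * x_next * x_next
        error = w * x * x - target      # critic error J(t) - target
        # Gradient step on the error, treating the target as a constant
        w -= lr * error * x * x
    return w


if __name__ == "__main__":
    w = train_hdp_critic()
    # For this toy plant the exact cost-to-go is x**2 / (1 - gamma * a**2)
    print(w, 1.0 / (1.0 - 0.9 * 0.5 ** 2))
```

In this toy setting the learned weight can be compared against the closed-form value 1 / (1 - gamma * a^2). A DHP critic would instead be trained to output the derivative of J with respect to the state, and GDHP combines both kinds of information.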
