Balakrishnan S N, Ding Jie, Lewis Frank L
Department of Mechanical and Aerospace Engineering, Missouri University of Science and Technology, Rolla, MO 65401, USA.
IEEE Trans Syst Man Cybern B Cybern. 2008 Aug;38(4):913-7. doi: 10.1109/TSMCB.2008.926599.
This paper traces the development of neural-network (NN)-based feedback controllers derived from the principle of adaptive/approximate dynamic programming (ADP) and discusses their closed-loop stability. It discusses the different NN structures in the literature, called "adaptive critic" or "action-critic" networks, which embed mathematical mappings related to solutions of ADP-formulated problems. The distinction between the two classes of ADP applications is pointed out. Furthermore, papers on "model-free" development and on model-based neurocontrollers are reviewed in terms of their contributions to stability issues. Recent literature suggests that work on ADP-based feedback controllers with assured stability is growing in diverse forms.