Faculty of Engineering Technology, Al-Balqa Applied University, Jordan.
Neural Netw. 2011 Dec;24(10):1128-35. doi: 10.1016/j.neunet.2011.06.006. Epub 2011 Jun 22.
An optimal stochastic controller pushes the closed-loop behavior as close as possible to the desired one. Fully probabilistic design (FPD) uses a probabilistic description of the desired closed loop and minimizes the Kullback-Leibler divergence of the closed-loop description from the desired one. Practical exploitation of FPD control theory continues to be hindered by the computational complexity of numerically solving the associated stochastic dynamic programming problem; in particular, it requires difficult multivariate integration and approximate interpolation of the multivariate functions involved. This paper proposes a new fully probabilistic control algorithm that uses adaptive critic methods to circumvent the need to evaluate the optimal value function explicitly, thereby dramatically reducing computational requirements. This is the main contribution of the paper.
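To make the FPD objective concrete, the following is a minimal one-step sketch on a finite problem. It is not the adaptive critic algorithm proposed in the paper. All names and numbers (`system`, `ideal_system`, `ideal_policy`, the two actions and three next states) are hypothetical and chosen only for illustration. The sketch assumes the standard one-step FPD result that the randomized policy minimizing the Kullback-Leibler divergence is proportional to the ideal policy times `exp(-omega)`, where `omega` is the divergence of the system model from its ideal for each action.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Hypothetical one-step problem: two actions, three possible next states.
# system[u][x] is the modelled probability of next state x under action u.
system = [[0.6, 0.3, 0.1],
          [0.2, 0.5, 0.3]]
# ideal_system[u][x] is the desired next-state distribution under action u.
ideal_system = [[0.8, 0.15, 0.05],
                [0.8, 0.15, 0.05]]
# ideal_policy[u] is the desired distribution over actions.
ideal_policy = [0.5, 0.5]

def closed_loop_kl(policy):
    """Divergence of the joint (action, next state) distribution from the ideal joint."""
    joint = [policy[u] * system[u][x] for u in range(2) for x in range(3)]
    ideal = [ideal_policy[u] * ideal_system[u][x] for u in range(2) for x in range(3)]
    return kl(joint, ideal)

def fpd_one_step():
    """Return the randomized policy minimizing closed_loop_kl and its normalizer."""
    # omega[u]: how far the system under action u is from its ideal behavior.
    omega = [kl(system[u], ideal_system[u]) for u in range(2)]
    weights = [ideal_policy[u] * math.exp(-omega[u]) for u in range(2)]
    gamma = sum(weights)
    return [w / gamma for w in weights], gamma

policy, gamma = fpd_one_step()
print("optimal randomized policy:", policy)
print("attained divergence:", closed_loop_kl(policy))
```

In this one-step case the minimum is available in closed form, and the attained divergence equals `-log(gamma)`. Over longer horizons the normalizer plays the role of a value function that must be propagated backwards through the dynamic programming recursion, which is the computation the abstract describes as the bottleneck.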