Heydari Ali
IEEE Trans Neural Netw Learn Syst. 2018 Sep;29(9):4522-4527. doi: 10.1109/TNNLS.2017.2755501. Epub 2017 Oct 16.
Adaptive optimal control using value iteration initiated from a stabilizing control policy is analyzed theoretically. The analysis addresses the stability of the system during the learning stage, both when the system is operated under any fixed control policy and when it is operated under an evolving policy. A feature of the presented results is that they yield subsets of the region of attraction such that, if the initial condition belongs to such a subset, the entire state trajectory remains within the training region. The function approximation results therefore remain reliable, because no extrapolation is performed.
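
The idea summarized above can be illustrated with a minimal sketch, which is not the paper's algorithm. It assumes a discrete-time linear system with quadratic cost, so the value function is an exact quadratic form x'Px and no function approximator is needed. The system matrices, the initial gain K0, the box-shaped training region of half-width r, and all function names are hypothetical choices made for illustration. Value iteration starts from the cost-to-go of a stabilizing policy, the stability of each intermediate policy is checked through its closed-loop spectral radius, and a sublevel set of the value function that fits inside the training region serves as an estimate of a region from which trajectories do not leave that region.

```python
import numpy as np

def policy_cost_matrix(A, B, Q, R, K):
    """Cost-to-go matrix P of the fixed policy u = -K x (discrete Lyapunov equation)."""
    Acl = A - B @ K
    Qcl = Q + K.T @ R @ K
    n = A.shape[0]
    # P = Acl' P Acl + Qcl  <=>  (I - kron(Acl', Acl')) vec(P) = vec(Qcl)
    lhs = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    P = np.linalg.solve(lhs, Qcl.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)

def greedy_gain(A, B, R, P):
    """Gain of the policy that is greedy with respect to V(x) = x' P x."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def value_iteration(A, B, Q, R, K0, iters=500):
    """Value iteration initiated from the cost-to-go of the stabilizing gain K0."""
    P = policy_cost_matrix(A, B, Q, R, K0)
    Ps, Ks = [], []
    for _ in range(iters):
        K = greedy_gain(A, B, R, P)
        Ps.append(P)
        Ks.append(K)
        Acl = A - B @ K
        P = Q + K.T @ R @ K + Acl.T @ P @ Acl  # Bellman update in quadratic form
        P = 0.5 * (P + P.T)
    return Ps, Ks

def largest_level_in_box(P, r):
    """Largest c such that {x : x' P x <= c} lies inside the box |x_j| <= r."""
    # Over the ellipsoid x' P x <= c, the maximum of |x_j| is sqrt(c * inv(P)[j, j]).
    return r**2 / np.max(np.diag(np.linalg.inv(P)))

# Hypothetical example: discretized double integrator (open loop is not stable).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])  # assumed stabilizing initial policy

Ps, Ks = value_iteration(A, B, Q, R, K0)
radii = [max(abs(np.linalg.eigvals(A - B @ K))) for K in Ks]
print("largest closed-loop spectral radius over all iterations:", max(radii))

# Level of the sublevel set of an intermediate value function that fits in the box.
r = 1.0
c = largest_level_in_box(Ps[3], r)
print("sublevel-set level for iteration 3:", c)
```

In this linear-quadratic setting each intermediate value function decreases along trajectories of its own greedy policy, so a trajectory that starts in the computed sublevel set stays inside it and hence inside the box. With a nonlinear system and a function approximator, the corresponding estimate would depend on the approximation quality inside the training region, which this sketch does not model.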