School of Aerospace, Transport and Manufacturing, Cranfield University, Bedford MK43 0AL, UK.
Neural Netw. 2022 Jan;145:33-41. doi: 10.1016/j.neunet.2021.10.009. Epub 2021 Oct 21.
In this paper, a complementary learning scheme for experience transfer in unknown continuous-time linear systems is proposed. The algorithm is inspired by the complementary learning properties exhibited by the hippocampus and neocortex learning systems via the striatum. The hippocampus is modelled as pattern-separated data from a human-optimized controller. The neocortex is modelled as a Q-reinforcement-learning algorithm that improves the hippocampus control policy. The complementary learning (striatum) is designed as an inverse reinforcement learning algorithm that relates the hippocampus and neocortex learning models to seek and transfer the weights of the hidden expert's utility function. Convergence of the proposed approach is analysed using Lyapunov recursions. Simulations are given to verify the proposed approach.
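The Q-learning-style policy improvement described in the abstract can be illustrated on a linear-quadratic problem. The sketch below is not the paper's algorithm: it assumes a hypothetical discretized two-state system (the paper treats continuous-time systems) and uses standard LQR policy iteration, in which each candidate gain is evaluated via a Lyapunov equation and then improved by a greedy step on the quadratic Q-function.

```python
import numpy as np

# Hypothetical two-state discrete-time system (illustrative values only,
# not taken from the paper).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Qc = np.eye(2)           # state cost weight
Rc = np.array([[1.0]])   # input cost weight

def policy_value(K):
    """Policy evaluation: solve P = Acl' P Acl + Q + K' R K for u = -K x.

    Uses a plain fixed-point iteration, which converges because the
    closed loop Acl is assumed stable for the current gain K.
    """
    Acl = A - B @ K
    Qk = Qc + K.T @ Rc @ K
    P = np.zeros((2, 2))
    for _ in range(2000):
        P = Acl.T @ P @ Acl + Qk
    return P

# Initial stabilizing gain (chosen by hand for this toy system).
K = np.array([[1.0, 1.0]])
for _ in range(20):
    P = policy_value(K)
    # Policy improvement: minimizer of the quadratic Q-function,
    # K <- (R + B' P B)^{-1} B' P A.
    K = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)
```

Under these assumptions the iteration converges to the optimal LQR gain; the paper's scheme additionally runs an inverse-reinforcement-learning step that recovers the expert's hidden cost weights (here, the roles of `Qc` and `Rc`) from demonstrated behaviour rather than assuming them known.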