Department of Mathematics, University of California at Los Angeles, Los Angeles, CA 90095, USA.
Neural Comput. 2013 Jan;25(1):123-56. doi: 10.1162/NECO_a_00387. Epub 2012 Sep 28.
In this letter, a novel critic-like algorithm is developed that extends the synaptic plasticity rule described in Florian (2007) and Izhikevich (2007) so that multiple distal rewards can be learned simultaneously. The system is augmented with short-term plasticity (STP) to stabilize the learning dynamics, thereby increasing its learning capacity. A theoretical threshold is estimated for the number of distal rewards that the system can learn. The validity of the algorithm is verified by computer simulations.