Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba, Japan.
Neural Netw. 2010 Mar;23(2):219-25. doi: 10.1016/j.neunet.2009.11.013. Epub 2009 Dec 2.
Node perturbation learning has been receiving much attention as a method for achieving stochastic gradient descent. As it does not require direct gradient calculations, it can be applied to a reinforcement learning framework. However, in conventional node perturbation learning, the residual error due to perturbation is not eliminated even after convergence. Using infinitesimal perturbations suppresses the residual error, but such perturbations are less robust against uncertainty and noise in an eligibility trace, which is a memory of perturbation and input. We derive an optimal parameter schedule for node perturbation learning used with linear perceptrons with uncertainty in the eligibility trace. Our adaptive learning rule resolves the trade-off between robustness against the uncertainty and residual error reduction. The results obtained will be useful in designing learning rules and interpreting related biological knowledge.
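As an illustration of the setting described above, the following is a minimal sketch of node perturbation learning on a linear perceptron in a teacher-student setup. It is not the paper's derived optimal schedule: the learning rate `eta` and perturbation size `sigma` are held constant, and the function names, the Gaussian input distribution, and the additive-noise model for the eligibility trace (`trace_noise`) are all assumptions made for this example.

```python
import numpy as np


def generalization_error(W, W_teacher, n_in):
    # For inputs x ~ N(0, I / n_in), the expected value of
    # 0.5 * ||(W - W_teacher) x||^2 is ||W - W_teacher||_F^2 / (2 * n_in).
    return float(np.sum((W - W_teacher) ** 2) / (2.0 * n_in))


def train_node_perturbation(n_in=10, n_out=3, steps=5000, eta=0.05,
                            sigma=0.01, trace_noise=0.0, seed=0):
    """Train a linear perceptron with node perturbation (illustrative only).

    Returns the generalization error before and after training.
    """
    rng = np.random.default_rng(seed)
    W_teacher = rng.standard_normal((n_out, n_in))  # hypothetical teacher
    W = np.zeros((n_out, n_in))                     # student weights
    err_start = generalization_error(W, W_teacher, n_in)

    for _ in range(steps):
        x = rng.standard_normal(n_in) / np.sqrt(n_in)
        target = W_teacher @ x
        y = W @ x

        # Perturb the output nodes and compare perturbed vs. baseline error.
        perturbation = sigma * rng.standard_normal(n_out)
        e_base = 0.5 * np.sum((y - target) ** 2)
        e_pert = 0.5 * np.sum((y + perturbation - target) ** 2)

        # Eligibility trace: memory of perturbation and input.
        # Assumption: uncertainty enters as additive Gaussian noise of fixed size.
        trace = np.outer(perturbation, x)
        trace += trace_noise * rng.standard_normal(trace.shape)

        # No explicit gradient is used; averaged over perturbations, this
        # update points along the negative gradient of the squared error.
        W -= (eta / sigma**2) * (e_pert - e_base) * trace

    return err_start, generalization_error(W, W_teacher, n_in)
```

Calling `train_node_perturbation()` returns the error before and after training; passing a nonzero `trace_noise` together with different values of `sigma` gives one way to explore, under this assumed noise model, the trade-off between robustness and residual error that the abstract describes.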