An P E, Brown M, Harris C J
Dept. of Electron. and Comput. Sci., Southampton Univ.
IEEE Trans Neural Netw. 1995;6(6):1549-51. doi: 10.1109/72.471354.
Supervised parameter adaptation in many artificial neural networks is largely based on an instantaneous version of gradient descent called the least-mean-square (LMS) algorithm. This paper considers only neural models that are linear with respect to their adaptable parameters and makes two major contributions. First, it derives an expression for the gradient-noise covariance under the assumption that the input samples are real, stationary, and Gaussian distributed, but possibly partially correlated. This expression relates the gradient correlation and input correlation matrices to the gradient-noise covariance and explains why the gradient noise generally correlates maximally with the steepest principal axis and minimally with the axis of smallest curvature, regardless of the magnitude of the weight error. Second, a recursive expression for the weight-error correlation matrix is derived in a straightforward manner using the gradient-noise covariance, and comparisons are drawn with the complex LMS algorithm.
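For context, a minimal sketch of the standard real-valued LMS quantities the abstract refers to, under the linear-in-the-parameters assumption; the notation (step size $\eta$, input correlation matrix $\mathbf{R}$, cross-correlation vector $\mathbf{p}$, optimal weight vector $\mathbf{w}^*$) is conventional Wiener-filter notation, not necessarily the paper's own:

\[
e_k = d_k - \mathbf{w}_k^{\mathsf T}\mathbf{x}_k, \qquad
\mathbf{w}_{k+1} = \mathbf{w}_k + \eta\, e_k\, \mathbf{x}_k,
\]
\[
\mathbf{n}_k = \underbrace{-2\, e_k\, \mathbf{x}_k}_{\text{instantaneous gradient of } e_k^2} \; - \; \underbrace{2\,(\mathbf{R}\,\mathbf{w}_k - \mathbf{p})}_{\nabla J(\mathbf{w}_k),\ J=E[e_k^2]}, \qquad
\mathbf{K}_k = E\!\left[(\mathbf{w}_k - \mathbf{w}^*)(\mathbf{w}_k - \mathbf{w}^*)^{\mathsf T}\right],
\]

with $\mathbf{R} = E[\mathbf{x}_k \mathbf{x}_k^{\mathsf T}]$ and $\mathbf{p} = E[d_k \mathbf{x}_k]$ (the factor of 2 absorbed into $\eta$ in the update). In these terms, the first contribution is an expression for the covariance of the gradient noise $\mathbf{n}_k$ under the stated Gaussian input assumptions, and the second is a recursion for the weight-error correlation matrix $\mathbf{K}_k$ built from that covariance.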