Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, Groningen, Netherlands.
Neural Comput. 2010 Nov;22(11):2924-61. doi: 10.1162/NECO_a_00030.
A variety of modifications of learning vector quantization (LVQ) algorithms have been employed, using either crisp or soft windows for the selection of data. Although these schemes have been shown to improve performance in practice, theoretical studies of the influence of windows have so far been limited. Here we rigorously analyze the influence of windows in a controlled environment of gaussian mixtures in high dimensions. Concepts from statistical physics and the theory of online learning allow an exact description of the training dynamics, yielding typical learning curves, convergence properties, and achievable generalization abilities. We compare the performance and demonstrate the advantages of various algorithms, including LVQ 2.1, generalized LVQ (GLVQ), learning from mistakes (LFM), and robust soft LVQ (RSLVQ). We find that the choice of the window parameter strongly influences the learning curves but, surprisingly, not the asymptotic performance of LVQ 2.1 and RSLVQ. Although the prototypes of LVQ 2.1 exhibit divergent behavior, the resulting decision boundary coincides with the optimal decision boundary, thus yielding optimal generalization ability.
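The crisp window rule of Kohonen's LVQ 2.1 discussed above can be sketched as follows. This is a minimal illustrative implementation, not code from the paper: the function name, parameter names, and the relative-window formulation `min(dJ/dK, dK/dJ) > (1 - w)/(1 + w)` follow the standard textbook form of LVQ 2.1, in which only examples near the current decision boundary trigger an update.

```python
import numpy as np

def lvq21_step(x, y, prototypes, labels, eta=0.05, window=0.3):
    """One LVQ 2.1 update on example (x, y).

    The closest prototype with the correct label and the closest
    prototype with an incorrect label are adapted only if x falls
    inside the window around the decision boundary.
    (Illustrative sketch; names are not from the paper.)
    """
    d = np.linalg.norm(prototypes - x, axis=1)
    same = labels == y
    J = np.where(same)[0][np.argmin(d[same])]    # closest correct prototype
    K = np.where(~same)[0][np.argmin(d[~same])]  # closest incorrect prototype
    dJ, dK = d[J], d[K]
    s = (1.0 - window) / (1.0 + window)
    # window rule: update only if x lies close to the boundary,
    # i.e., the two distances are of comparable size
    if min(dJ / dK, dK / dJ) > s:
        prototypes[J] += eta * (x - prototypes[J])  # attract correct prototype
        prototypes[K] -= eta * (x - prototypes[K])  # repel incorrect prototype
    return prototypes
```

Shrinking `window` toward 0 restricts updates to examples ever closer to the boundary; this selectivity is what drives the learning-curve differences analyzed in the paper, while the repulsive term on the incorrect prototype is the source of the divergent prototype behavior noted in the abstract.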