Kwok, James T.; Zhang, Kai
Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
IEEE Trans Neural Netw. 2010 Apr;21(4):644-58. doi: 10.1109/TNN.2010.2040835. Epub 2010 Feb 22.
The finite mixture model is widely used in various statistical learning problems. However, the fitted model may contain a large number of components, making it inefficient in practical applications. In this paper, we propose to simplify a mixture model by minimizing an upper bound on the approximation error between the original and the simplified models under the L2 distance measure. This is achieved by first grouping similar components together and then performing local fitting through function approximation. The simplified model can then be used as a replacement for the original model to speed up algorithms involving mixture models, during both training (e.g., Bayesian filtering, belief propagation) and testing [e.g., kernel density estimation, support vector machine (SVM) testing]. Encouraging results are observed in experiments on density estimation, clustering-based image segmentation, and simplification of SVM decision functions.
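The abstract compresses the method into two steps (grouping similar components, then local fitting) under an L2 error criterion. As a rough, hypothetical illustration only, the Python sketch below simplifies a 1-D Gaussian mixture by grouping components with a small k-means-style loop and collapsing each group by moment matching, then evaluates the exact squared L2 distance between the original and simplified mixtures via closed-form Gaussian overlap integrals. The function names (simplify_mixture, merge_group, l2_distance) are ours; this is not the paper's upper-bound minimization or its function-approximation refitting step.

```python
import numpy as np

def gauss_overlap(m1, v1, m2, v2):
    # Closed-form integral of the product of two 1-D Gaussians:
    #   int N(x; m1, v1) N(x; m2, v2) dx = N(m1; m2, v1 + v2).
    s = v1 + v2
    return np.exp(-0.5 * (m1 - m2) ** 2 / s) / np.sqrt(2.0 * np.pi * s)

def l2_distance(w1, m1, v1, w2, m2, v2):
    """Exact squared L2 distance between two 1-D Gaussian mixtures,
    from ||f - g||^2 = int f^2 + int g^2 - 2 int f g."""
    d = np.sum(w1[:, None] * w1[None, :] *
               gauss_overlap(m1[:, None], v1[:, None], m1[None, :], v1[None, :]))
    d += np.sum(w2[:, None] * w2[None, :] *
                gauss_overlap(m2[:, None], v2[:, None], m2[None, :], v2[None, :]))
    d -= 2.0 * np.sum(w1[:, None] * w2[None, :] *
                      gauss_overlap(m1[:, None], v1[:, None], m2[None, :], v2[None, :]))
    return d

def merge_group(weights, means, variances):
    """Collapse a group of 1-D Gaussian components into one Gaussian
    that preserves the group's total weight, mean, and variance."""
    w = weights.sum()
    mu = np.dot(weights, means) / w
    var = np.dot(weights, variances + (means - mu) ** 2) / w
    return w, mu, var

def simplify_mixture(weights, means, variances, n_out, n_iter=20):
    """Group similar components (k-means-style, on component means)
    and moment-match each group. A stand-in for the paper's
    grouping + local-fitting procedure, not a reimplementation."""
    order = np.argsort(weights)[::-1]            # seed with heaviest components
    centers = means[order[:n_out]].copy()
    for _ in range(n_iter):
        labels = np.argmin(np.abs(means[:, None] - centers[None, :]), axis=1)
        for k in range(n_out):                   # weight-averaged group centers
            mask = labels == k
            if mask.any():
                centers[k] = np.dot(weights[mask], means[mask]) / weights[mask].sum()
    out = [merge_group(weights[labels == k], means[labels == k], variances[labels == k])
           for k in range(n_out) if (labels == k).any()]
    w, mu, var = map(np.array, zip(*out))
    return w / w.sum(), mu, var

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    w = rng.random(n); w /= w.sum()
    m = rng.normal(0.0, 3.0, n)
    v = rng.random(n) + 0.1
    ws, ms, vs = simplify_mixture(w, m, v, n_out=5)
    print("squared L2 error of 5-component approximation:",
          l2_distance(w, m, v, ws, ms, vs))
```

Moment matching keeps each group's local mass, location, and spread, which is why the collapsed mixture tends to stay close to the original in L2; the paper's approach instead bounds and minimizes that error directly.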