Ueda Naonori, Ghahramani Zoubin
NTT Communication Science Laboratories, Soraku-gun, Kyoto, Japan.
Neural Netw. 2002 Dec;15(10):1223-41. doi: 10.1016/s0893-6080(02)00040-0.
When learning a mixture model, one faces two problems: local optima and determination of the model structure. In this paper, we present a method for simultaneously solving these problems based on the variational Bayesian (VB) framework. First, in the VB framework, we derive an objective function that can simultaneously optimize both model parameter distributions and model structure. Next, focusing on mixture models, we present a deterministic algorithm that approximately optimizes the objective function by using the idea of the split and merge operations which we previously proposed within the maximum likelihood framework. Then, we apply the method to mixture of experts (MoE) models to show experimentally that the proposed method can find the optimal number of experts of a MoE while avoiding local maxima.
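As an illustration of the kind of split-and-merge heuristic the abstract refers to, the sketch below ranks pairs of mixture components as merge candidates by the similarity of their responsibility vectors, a criterion of the form J(i, j) = r_i · r_j used in the authors' earlier maximum-likelihood split-and-merge work. This is a minimal sketch, not the paper's full VB algorithm; the function name and array layout are my own assumptions.

```python
import numpy as np

def merge_candidates(resp):
    """Rank component pairs as merge candidates.

    resp : (N, K) array of responsibilities, where resp[n, k] is the
           posterior probability that point n belongs to component k.

    Two components whose responsibility vectors (columns of resp) point
    in similar directions explain largely the same data, so they score
    highly under J(i, j) = r_i . r_j and are merged first.
    """
    K = resp.shape[1]
    scores = resp.T @ resp  # (K, K) Gram matrix of responsibility columns
    pairs = [(scores[i, j], i, j) for i in range(K) for j in range(i + 1, K)]
    pairs.sort(reverse=True)  # highest dot product = best merge candidate
    return [(i, j) for _, i, j in pairs]

# Toy example: components 0 and 1 both explain the first cluster,
# so (0, 1) should come out as the top merge candidate.
resp = np.array([
    [0.50, 0.50, 0.00],
    [0.50, 0.50, 0.00],
    [0.45, 0.55, 0.00],
    [0.00, 0.00, 1.00],
    [0.00, 0.00, 1.00],
])
print(merge_candidates(resp)[0])  # -> (0, 1)
```

In the full algorithm, each candidate merge (and a complementary split move) is accepted only if it increases the VB objective, which is what lets a single criterion drive both escape from local optima and selection of the number of components.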