Nguyen Hien D, Lloyd-Jones Luke R, McLachlan Geoffrey J
School of Mathematics and Physics, University of Queensland, Brisbane, Queensland 4072, Australia
Centre for Neurogenetics and Statistical Genetics, Queensland Brain Institute, University of Queensland, Brisbane, Queensland 4072, Australia
Neural Comput. 2016 Dec;28(12):2585-2593. doi: 10.1162/NECO_a_00892. Epub 2016 Sep 14.
The mixture-of-experts (MoE) model is a popular neural network architecture for nonlinear regression and classification. The class of MoE mean functions is known to be uniformly convergent to any unknown target function, assuming that the target function is from a Sobolev space that is sufficiently differentiable and that the domain of estimation is a compact unit hypercube. We provide an alternative result, which shows that the class of MoE mean functions is dense in the class of all continuous functions over arbitrary compact domains of estimation. Our result can be viewed as a universal approximation theorem for MoE models. The theorem we present allows MoE users to be confident in applying such models for estimation when data arise from nonlinear and nondifferentiable generative processes.
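To make the density claim concrete, the following is a minimal sketch, not the authors' construction or proof. It assumes a common parameterization that the abstract does not spell out: softmax gates and linear experts on a scalar input. The names `softmax`, `moe_mean`, and `sup_error_abs` are hypothetical and used only for illustration. The sketch shows a two-expert MoE mean function approaching the continuous but nondifferentiable target |x| on the compact interval [-1, 1].

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_mean(x, gate_params, expert_params):
    """MoE mean function for scalar x: sum_k gate_k(x) * expert_k(x).

    Assumed parameterization (not taken from the abstract):
      gate_params   -- list of (alpha_k, beta_k); gate score is alpha_k + beta_k * x
      expert_params -- list of (a_k, b_k); expert mean is a_k + b_k * x
    """
    gates = softmax([alpha + beta * x for alpha, beta in gate_params])
    experts = [a + b * x for a, b in expert_params]
    return sum(g * m for g, m in zip(gates, experts))


def sup_error_abs(sharpness, n_grid=2001):
    """Largest |moe_mean(x) - |x|| over an even grid on [-1, 1].

    Two experts, -x and +x, are blended by gates whose slope is
    controlled by `sharpness`; larger values switch experts more abruptly.
    """
    gate_params = [(0.0, -sharpness), (0.0, sharpness)]
    expert_params = [(0.0, -1.0), (0.0, 1.0)]
    grid = [-1.0 + 2.0 * i / (n_grid - 1) for i in range(n_grid)]
    return max(
        abs(moe_mean(x, gate_params, expert_params) - abs(x)) for x in grid
    )


if __name__ == "__main__":
    for sharpness in (5.0, 50.0, 500.0):
        print(sharpness, sup_error_abs(sharpness))
```

With these two experts the mean function reduces to x·tanh(sharpness·x), so the grid error shrinks as the gate sharpness grows. This only illustrates uniform approximation for one target on one interval; the theorem covers all continuous functions on arbitrary compact domains.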