Peralta Billy, Saavedra Ariel, Caro Luis, Soto Alvaro
Department of Engineering Science, Andres Bello University, Santiago 7500971, Chile.
Department of Engineering Informatics, Catholic University of Temuco, Temuco 4781312, Chile.
Entropy (Basel). 2019 Feb 18;21(2):190. doi: 10.3390/e21020190.
Today, there is growing interest in automatic classification for a variety of tasks, such as weather forecasting, product recommendations, intrusion detection, and people recognition. "Mixture-of-experts" is a well-known classification technique; it is a probabilistic model consisting of local expert classifiers weighted by a gating network that is typically based on softmax functions, allowing the model to learn complex patterns in the data. In this scheme, one data point is influenced by only one expert; as a result, the training process can be misguided on real datasets in which complex data need to be explained by multiple experts. In this work, we propose a variant of the regular mixture-of-experts model. In the proposed model, the classification cost is penalized by the Shannon entropy of the gating network in order to avoid a "winner-takes-all" output for the gating network. Experiments on several real datasets show the advantage of our approach, with improvements in mean accuracy of 3-6% on some datasets. In future work, we plan to embed feature selection into this model.
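A minimal sketch of the idea described in the abstract: a mixture-of-experts likelihood whose cost is regularized by the Shannon entropy of the gating distribution, so that low-entropy ("winner-takes-all") gate outputs are discouraged. The function names, the softmax experts, and the sign convention on the entropy term are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def moe_entropy_loss(X, y, gate_W, expert_Ws, lam=0.1):
    """Negative log-likelihood of a softmax-gated mixture-of-experts,
    regularized by the mean Shannon entropy of the gating weights.

    X: (n, d) inputs; y: (n,) integer labels
    gate_W: (d, K) gating weights; expert_Ws: list of K (d, C) expert weights
    lam: strength of the entropy regularizer (assumed hyperparameter)
    """
    g = softmax(X @ gate_W)                                    # (n, K) gate probs
    p = np.stack([softmax(X @ W) for W in expert_Ws], axis=1)  # (n, K, C) per-expert class probs
    mix = np.einsum('nk,nkc->nc', g, p)                        # mixture predictive distribution
    nll = -np.log(mix[np.arange(len(y)), y] + 1e-12).mean()
    gate_entropy = -(g * np.log(g + 1e-12)).sum(axis=1).mean()
    # subtracting lam * H(gate) rewards high-entropy gates, i.e. it
    # penalizes a "winner-takes-all" assignment of points to experts
    return nll - lam * gate_entropy
```

With `lam > 0` and a non-degenerate gate, the regularized loss is strictly below the plain negative log-likelihood, which is what steers optimization away from collapsing all responsibility onto a single expert.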