Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, 2-3-26 Aomi, Koto-ku, Tokyo, Japan.
Neural Netw. 2018 Sep;105:14-25. doi: 10.1016/j.neunet.2018.03.002. Epub 2018 Mar 13.
Hierarchical learning models, such as mixture models and Bayesian networks, are widely employed for unsupervised learning tasks such as cluster analysis. They consist of observable and latent variables, which represent the given data and their underlying generation process, respectively. It has been pointed out that conventional statistical analysis is not applicable to these models, because redundancy in the latent variables produces singularities in the parameter space. In recent years, a method based on algebraic geometry has made it possible to analyze how accurately Bayesian estimation predicts the observable variables. However, the analysis of latent variables has not been studied sufficiently, even though one of the main issues in unsupervised learning is to determine how accurately the latent variables are estimated. A previous study proposed a method applicable when the range of the latent variable is redundant compared with the model generating the data. The present paper extends that method to the situation in which the latent variables have redundant dimensions. We formulate new error functions and derive their asymptotic forms. Calculation of the error functions is demonstrated in two-layered Bayesian networks.
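The singularities mentioned above arise because a redundant latent structure makes the parameters non-identifiable: distinct parameter settings yield exactly the same distribution over the observable variables. A minimal sketch of this effect, using a standard two-component Gaussian mixture rather than the paper's own models, is the following (all function and variable names here are illustrative, not from the paper):

```python
import math

def mixture_pdf(x, w, mu1, mu2, sigma=1.0):
    """Density of the two-component Gaussian mixture w*N(mu1, s^2) + (1-w)*N(mu2, s^2)."""
    def norm_pdf(x, mu):
        return math.exp(-0.5 * (x - mu) ** 2 / sigma ** 2) / math.sqrt(2 * math.pi * sigma ** 2)
    return w * norm_pdf(x, mu1) + (1 - w) * norm_pdf(x, mu2)

# If the true data come from a single Gaussian N(0, 1), the second component is
# redundant, and a whole set of distinct parameters realizes the same density:
# e.g. {mu1 = mu2 = 0, any w} and {w = 1, mu1 = 0, any mu2} all coincide.
xs = [-1.0, 0.0, 0.5, 2.0]
p_a = [mixture_pdf(x, 0.3, 0.0, 0.0) for x in xs]  # both means collapsed to 0
p_b = [mixture_pdf(x, 1.0, 0.0, 5.0) for x in xs]  # all weight on the first component
assert all(abs(a - b) < 1e-12 for a, b in zip(p_a, p_b))
```

Because the true parameter is an entire set rather than a point, the Fisher information degenerates there, which is why the conventional (regular) asymptotic theory fails and algebraic-geometric methods are needed.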