Huang Haiping, Toyoizumi Taro
RIKEN Brain Science Institute, Wako-shi, Saitama 351-0198, Japan.
Phys Rev E. 2016 Dec;94(6-1):062310. doi: 10.1103/PhysRevE.94.062310. Epub 2016 Dec 21.
Unsupervised neural network learning extracts hidden features from unlabeled training data. This is used as a pretraining step for further supervised learning in deep networks. Hence, understanding unsupervised learning is of fundamental importance. Here, we study unsupervised learning from a finite number of data samples, based on a restricted Boltzmann machine with a single hidden neuron. Our study inspires an efficient message-passing algorithm to infer the hidden feature and estimate the entropy of candidate features consistent with the data. Our analysis reveals that the learning requires only a few samples if the feature is salient, but extensively many if the feature is weak. Moreover, the entropy of candidate features monotonically decreases with data size and becomes negative (i.e., entropy crisis) before the message passing becomes unstable, suggesting a discontinuous phase transition. In terms of the convergence time of the message-passing algorithm, the unsupervised learning exhibits an easy-hard-easy phenomenon as the training data size increases. All these properties are reproduced in an approximate Hopfield model, with the exception that the entropy crisis is absent and only a continuous phase transition is observed. This key difference is also confirmed in a handwritten-digits dataset. This study deepens our understanding of unsupervised learning from a finite number of data samples and may provide insights into its role in training deep networks.
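The setting above can be illustrated with a minimal sketch: an Ising restricted Boltzmann machine with a single hidden spin, where tracing out the hidden unit gives P(s) ∝ 2cosh(w·s/√N). Below, data are drawn exactly from a planted feature and the feature is recovered by plain maximum-likelihood gradient ascent (via exact enumeration over a small N), not by the paper's message-passing algorithm; the planted ±1 feature, the sample size, and the learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 10, 2000                              # visible units (small, so 2^N states are enumerable), samples
w_true = rng.choice([-1.0, 1.0], size=N)     # planted binary feature (assumption)

# All 2^N visible configurations s in {-1, +1}^N
states = np.array([[1 - 2 * ((k >> i) & 1) for i in range(N)]
                   for k in range(2 ** N)], dtype=float)

def log_unnorm(w):
    # log of unnormalized P(s) = 2 cosh(w.s / sqrt(N)), the single hidden spin traced out
    return np.log(2.0 * np.cosh(states @ w / np.sqrt(N)))

# Draw training data exactly from the planted model
p = np.exp(log_unnorm(w_true)); p /= p.sum()
data = states[rng.choice(len(states), size=M, p=p)]

# Maximum-likelihood gradient ascent; both terms of the gradient are exact here
w = rng.normal(size=N)
for _ in range(200):
    pos = (np.tanh(data @ w / np.sqrt(N))[:, None] * data).mean(0) / np.sqrt(N)
    pm = np.exp(log_unnorm(w)); pm /= pm.sum()
    neg = (pm[:, None] * np.tanh(states @ w / np.sqrt(N))[:, None] * states).sum(0) / np.sqrt(N)
    w += 0.5 * (pos - neg)

# The likelihood is invariant under w -> -w, so measure the absolute overlap with the planted feature
overlap = abs(w @ w_true) / (np.linalg.norm(w) * np.linalg.norm(w_true))
print(round(overlap, 2))
```

With a salient planted feature and ample samples, the recovered direction aligns closely with w_true (up to the inherent sign ambiguity), matching the abstract's observation that salient features need few data.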