Hinton Geoffrey E, Osindero Simon, Teh Yee-Whye
Department of Computer Science, University of Toronto, Canada.
Neural Comput. 2006 Jul;18(7):1527-54. doi: 10.1162/neco.2006.18.7.1527.
We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
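The abstract's central idea, learning a deep belief net greedily one layer at a time, can be illustrated with a short sketch. The following is a minimal illustration (not the authors' code): each layer is a binary restricted Boltzmann machine trained with one step of contrastive divergence (CD-1), and the inferred hidden activities of one layer become the training data for the next. The layer sizes, learning rate, and epoch counts below are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, lr=0.05, epochs=10, batch=100):
    """Train one binary RBM with CD-1; return (weights, visible bias, hidden bias)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    for _ in range(epochs):
        for i in range(0, len(data), batch):
            v0 = data[i:i + batch]
            # Positive phase: sample hidden units given the data.
            p_h0 = sigmoid(v0 @ W + b_h)
            h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
            # Negative phase: one reconstruction step (CD-1).
            p_v1 = sigmoid(h0 @ W.T + b_v)
            p_h1 = sigmoid(p_v1 @ W + b_h)
            # Approximate log-likelihood gradient from the two phases.
            W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
            b_v += lr * (v0 - p_v1).mean(axis=0)
            b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h

def greedy_pretrain(data, layer_sizes):
    """Stack RBMs: hidden activities of one layer are the data for the next layer."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b_v, b_h = train_rbm(x, n_hidden)
        layers.append((W, b_v, b_h))
        x = sigmoid(x @ W + b_h)  # propagate activities up to train the next layer
    return layers

# Toy usage with random binary "images"; real experiments would use digit data,
# and the pretrained stack would then be fine-tuned (e.g., with a contrastive
# wake-sleep procedure, as the abstract describes).
toy = (rng.random((1000, 784)) < 0.1).astype(float)
dbn = greedy_pretrain(toy, [500, 500, 2000])
```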