Accelerating deep learning with memcomputing.

Affiliations

Department of Physics, University of California, San Diego, La Jolla, CA 92093, United States.

MemComputing, Inc., San Diego, CA, 92130, United States.

Publication information

Neural Netw. 2019 Feb;110:1-7. doi: 10.1016/j.neunet.2018.10.012. Epub 2018 Nov 3.

Abstract

Restricted Boltzmann machines (RBMs) and their extensions, often called "deep-belief networks", are powerful neural networks that have found applications in machine learning and artificial intelligence. The standard way to train these models relies on an iterative unsupervised procedure based on Gibbs sampling, called "contrastive divergence", followed by supervised tuning via back-propagation. However, this procedure has been shown not to follow any gradient and can lead to suboptimal solutions. In this paper, we show an efficient alternative to contrastive divergence by means of simulations of digital memcomputing machines (DMMs) that compute the gradient of the log-likelihood involved in unsupervised training. We test our approach on pattern recognition using a modified version of the MNIST data set of hand-written digits. DMMs effectively sample the vast phase space defined by the probability distribution of RBMs and provide a good approximation close to the optimum. This efficient search significantly reduces the number of generative pretraining iterations needed to reach a given level of accuracy on the MNIST data set and yields an overall performance gain over traditional approaches. In fact, the pretraining acceleration achieved by simulating DMMs is comparable, in number of iterations, to the recently reported hardware application of quantum annealing on the same network and data set. Notably, however, DMMs perform far better than the reported quantum annealing results in terms of training quality. Finally, we also compare our method to recent advances in supervised training, such as batch-normalization and rectifiers, that seem to reduce the advantage of pretraining. We find that the memcomputing method still maintains a quality advantage (>1% in accuracy, corresponding to a 20% reduction in error rate) over these approaches, even though the network pretrained with memcomputing defines a more non-convex landscape, using sigmoidal activation functions without batch-normalization. Our approach is agnostic about the connectivity of the network. Therefore, it can be extended to train full Boltzmann machines, and even deep networks at once.
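
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of one-step contrastive divergence (CD-1) for a binary RBM. It illustrates the textbook Gibbs-sampling procedure that the paper proposes to replace, not the memcomputing method itself. All function names, the toy data, and the hyperparameters (`lr`, `epochs`, `n_hidden`) are assumptions chosen for illustration and do not come from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Return (activation probabilities, binary samples) of hidden units given v."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Return (activation probabilities, binary samples) of visible units given h."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return probs, [1 if rng.random() < p else 0 for p in probs]


def cd1_update(v0, W, b, c, lr, rng):
    """Apply one CD-1 step for a single training vector; updates W, b, c in place."""
    ph0, h0 = sample_hidden(v0, W, c, rng)   # data-driven (positive) phase
    _, v1 = sample_visible(h0, W, b, rng)    # one Gibbs step back to the visible layer
    ph1, _ = sample_hidden(v1, W, c, rng)    # reconstruction-driven (negative) phase
    for i in range(len(b)):
        for j in range(len(c)):
            # Difference of data and reconstruction correlations; this only
            # approximates the log-likelihood gradient.
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b[i] += lr * (v0[i] - v1[i])
    for j in range(len(c)):
        c[j] += lr * (ph0[j] - ph1[j])


def train(data, n_hidden, epochs=50, lr=0.1, seed=0):
    """Train a small RBM with CD-1 on a list of binary vectors (illustrative only)."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    b = [0.0] * n_visible
    c = [0.0] * n_hidden
    for _ in range(epochs):
        for v in data:
            cd1_update(v, W, b, c, lr, rng)
    return W, b, c


if __name__ == "__main__":
    toy_data = [[1, 1, 0, 0], [0, 0, 1, 1]]  # hypothetical toy patterns, not MNIST
    W, b, c = train(toy_data, n_hidden=2)
    print(len(W), len(W[0]))
```

In this sketch the negative-phase term comes from a single reconstruction step, which is why CD-1 only approximates the log-likelihood gradient. As the abstract describes it, the memcomputing approach instead uses DMM simulations to compute the gradient of the log-likelihood.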
