
Sparseness Analysis in the Pretraining of Deep Neural Networks.

Publication Information

IEEE Trans Neural Netw Learn Syst. 2017 Jun;28(6):1425-1438. doi: 10.1109/TNNLS.2016.2541681. Epub 2016 Mar 31.

Abstract

A major advance in deep multilayer neural networks (DNNs) is the invention of various unsupervised pretraining methods that initialize the network parameters and lead to good prediction accuracy. This paper presents a sparseness analysis of the hidden units during the pretraining process. In particular, we use the L-norm to measure sparseness and provide sufficient conditions under which pretraining leads to sparseness for popular pretraining models such as denoising autoencoders (DAEs) and restricted Boltzmann machines (RBMs). Our experimental results demonstrate that when these sufficient conditions are satisfied, the pretraining models do lead to sparseness. Our experiments also reveal that with sigmoid activation functions, pretraining plays an important sparseness-inducing role in DNNs with sigmoid units (Dsigm), whereas with rectified linear unit (ReLU) activations, pretraining becomes less effective for DNNs with ReLU units (Drelu). Fortunately, Drelu can reach higher recognition accuracy than pretrained DNNs (DAEs and RBMs), as it captures the main benefit of pretraining in Dsigm, namely the encouragement of sparseness. However, ReLU does not accommodate the different firing rates of biological neurons, because the firing rate actually changes with the varying membrane resistance. To address this problem, we further propose a family of rectifier piecewise linear units (RePLUs) to fit the different firing rates. The experimental results show that RePLU performs better than ReLU and is comparable with networks pretrained using techniques such as RBMs and DAEs.
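
The abstract turns on two concrete technical ideas: measuring the sparseness of a layer's hidden activations with an L-norm, and a rectified piecewise-linear activation (RePLU) whose slope changes over different input ranges to mimic varying firing rates. The NumPy sketch below illustrates both on random data; the breakpoints, slopes, and the L1-based sparseness helper are illustrative assumptions, since the paper's exact definitions are not reproduced in the abstract.

```python
# Minimal sketch (not the authors' code): an L1-style sparseness measure on
# hidden activations, and a hypothetical RePLU-like piecewise-linear rectifier.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    return np.maximum(0.0, x)

def replu_like(x, breakpoints=(0.0, 1.0), slopes=(1.0, 0.5)):
    """Hypothetical rectified piecewise-linear unit: zero for x < 0, then a
    different slope on each interval [breakpoints[i], breakpoints[i+1])."""
    y = np.zeros_like(x)
    for i, b in enumerate(breakpoints):
        hi = breakpoints[i + 1] if i + 1 < len(breakpoints) else np.inf
        # Contribution of the part of x that falls inside this segment.
        y += slopes[i] * (np.clip(x, b, hi) - b)
    return y

def mean_l1(h):
    """Mean L1 norm of the hidden activation vector per example; a rough proxy
    for sparseness when activations are on comparable scales."""
    return np.abs(h).sum(axis=1).mean()

# One random hidden layer applied to random inputs, just to compare measures.
X = rng.standard_normal((1000, 784))
W = 0.05 * rng.standard_normal((784, 256))
pre = X @ W

for name, act in [("sigmoid", sigmoid), ("relu", relu), ("replu-like", replu_like)]:
    h = act(pre)
    frac_zero = np.mean(h == 0.0)
    print(f"{name:10s}  mean L1 = {mean_l1(h):8.2f}   fraction exactly zero = {frac_zero:.2f}")
```

On zero-mean random pre-activations, the ReLU-style units set roughly half of the hidden activations to exactly zero, which is the kind of sparseness the abstract says pretraining induces in sigmoid networks; the piecewise-linear variant keeps that property while allowing a different slope per input range.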

