IEEE Trans Neural Netw Learn Syst. 2019 Sep;30(9):2650-2661. doi: 10.1109/TNNLS.2018.2885972. Epub 2019 Jan 7.
This paper proposes a new and efficient technique for regularizing deep neural networks using correlations among features. Previous studies have shown that oversized deep neural network models tend to produce many redundant features that are either shifted versions of one another or are very similar with little or no variation, resulting in redundant filtering. We propose a way to address this problem and show that such redundancy can be avoided using regularization and an adaptive feature dropout mechanism. We show that regularizing both negatively and positively correlated features according to their differentiation and their relative cosine distances yields networks that extract dissimilar features, with less overfitting and better generalization. This concept is illustrated with a deep multilayer perceptron, convolutional neural network, sparse autoencoder, gated recurrent unit, and long short-term memory network on the MNIST digit recognition, CIFAR-10, ImageNet, and Stanford Natural Language Inference data sets.
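To make the idea of penalizing correlated features by cosine distance concrete, the following is a minimal NumPy sketch and not the formulation used in the paper. It assumes each row of a weight matrix `W` is one feature detector, measures redundancy as the absolute pairwise cosine similarity (so positively and negatively correlated pairs are treated alike), and uses a simple threshold-based mask as a stand-in for adaptive feature dropout. The function names, the `threshold` parameter, and the averaging scheme are all illustrative assumptions.

```python
import numpy as np


def cosine_similarity_matrix(W, eps=1e-12):
    """Pairwise cosine similarity between the rows of W (one row per feature)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_unit = W / np.maximum(norms, eps)  # guard against zero-norm rows
    return W_unit @ W_unit.T


def correlation_penalty(W, threshold=0.0):
    """Mean absolute off-diagonal cosine similarity exceeding `threshold`.

    Hypothetical penalty for illustration: the absolute value means that
    strongly negative and strongly positive correlations are both penalized.
    """
    n = W.shape[0]
    if n < 2:
        return 0.0
    S = np.abs(cosine_similarity_matrix(W))
    off_diag = S[~np.eye(n, dtype=bool)]  # drop self-similarities
    excess = np.where(off_diag > threshold, off_diag, 0.0)
    return float(excess.sum() / (n * (n - 1)))


def redundant_feature_mask(W, threshold=0.9):
    """Boolean keep-mask that drops the later feature of any pair with |cos| > threshold.

    A crude stand-in for adaptive feature dropout, assumed for this sketch.
    """
    n = W.shape[0]
    S = np.abs(cosine_similarity_matrix(W))
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        if not keep[i]:
            continue
        for j in range(i + 1, n):
            if keep[j] and S[i, j] > threshold:
                keep[j] = False
    return keep


# Example: the third feature is the negation of the first, so it is redundant.
W_demo = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
penalty_demo = correlation_penalty(W_demo)
mask_demo = redundant_feature_mask(W_demo)
```

In a training loop, a penalty of this kind would typically be scaled by a small coefficient and added to the task loss, so that gradient updates push feature vectors apart in direction rather than merely shrinking their magnitudes.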