Li Xiaoxu, Chang Dongliang, Ma Zhanyu, Tan Zheng-Hua, Xue Jing-Hao, Cao Jie, Yu Jingyi, Guo Jun
IEEE Trans Image Process. 2020 May 6. doi: 10.1109/TIP.2020.2990277.
A deep neural network of multiple nonlinear layers forms a large function space, which can easily lead to overfitting when it encounters small-sample data. To mitigate overfitting in small-sample classification, learning more discriminative features from small-sample data is becoming a new trend. To this end, this paper aims to find a subspace of neural networks that can facilitate a large decision margin. Specifically, we propose the Orthogonal Softmax Layer (OSL), which makes the weight vectors in the classification layer remain orthogonal during both the training and test processes. The Rademacher complexity of a network using the OSL is only 1/K of that of a network using a fully connected classification layer, where K is the number of classes, leading to a tighter generalization error bound. Experimental results demonstrate that the proposed OSL outperforms the compared methods on four small-sample benchmark datasets, and also show its applicability to large-sample datasets. Code is available at: https://github.com/dongliangchang/OSLNet.
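The abstract does not spell out how the orthogonality of the classification-layer weight vectors is enforced. One simple way to guarantee it is to mask the weight matrix with a fixed block-diagonal pattern, so that each class weight vector lives on a disjoint slice of the feature vector and any two class vectors are orthogonal by construction. The following is a minimal PyTorch sketch under that assumption; the name OrthogonalSoftmaxLayer, the block-diagonal mask, and the layer sizes are illustrative rather than the authors' verified implementation (their official code is at the GitHub link above).

```python
# A minimal sketch of one way to keep classification-layer weight vectors
# orthogonal, assuming a fixed block-diagonal mask over the weight matrix.
# Each class uses a disjoint slice of the feature vector, so any two class
# weight vectors are trivially orthogonal. Names and sizes are illustrative.
import torch
import torch.nn as nn


class OrthogonalSoftmaxLayer(nn.Module):
    def __init__(self, in_features: int, num_classes: int):
        super().__init__()
        assert in_features % num_classes == 0, "feature dim must split evenly"
        self.weight = nn.Parameter(torch.randn(num_classes, in_features) * 0.01)
        # Block-diagonal 0/1 mask: class k only "sees" its own feature block,
        # so the masked weight vectors of different classes never overlap.
        block = in_features // num_classes
        mask = torch.zeros(num_classes, in_features)
        for k in range(num_classes):
            mask[k, k * block:(k + 1) * block] = 1.0
        self.register_buffer("mask", mask)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Applying the mask at every forward pass keeps the effective class
        # weight vectors orthogonal during both training and testing.
        return x @ (self.weight * self.mask).t()


if __name__ == "__main__":
    layer = OrthogonalSoftmaxLayer(in_features=512, num_classes=4)
    logits = layer(torch.randn(8, 512))  # shape: (batch, num_classes)
    w = layer.weight * layer.mask
    # Off-diagonal entries of the Gram matrix are zero: the class vectors
    # are mutually orthogonal.
    print(torch.allclose(w @ w.t(), torch.diag((w ** 2).sum(dim=1))))
```

Because the mask is fixed, the orthogonality constraint holds exactly at every step of optimization with no extra penalty term; the logits can then be fed to a standard softmax cross-entropy loss.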