Su Jiahao, Li Jingling, Liu Xiaoyu, Ranadive Teresa, Coley Christopher, Tuan Tai-Ching, Huang Furong
Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States.
Department of Computer Science, University of Maryland, College Park, MD, United States.
Front Artif Intell. 2022 Mar 8;5:728761. doi: 10.3389/frai.2022.728761. eCollection 2022.
We propose a framework of tensorial neural networks (TNNs) that extends existing linear layers on low-order tensors to multilinear operations on higher-order tensors. TNNs have three advantages over existing networks: First, TNNs apply naturally to higher-order data without flattening, which preserves the data's multi-dimensional structure. Second, compressing a pre-trained network into a TNN yields a model with similar expressive power but fewer parameters. Finally, TNNs offer an interpretation of advanced compact network designs, such as bottleneck modules and interleaved group convolutions. To learn TNNs, we derive their backpropagation rules using a novel suite of generalized tensor algebra. With backpropagation, we can either train TNNs from scratch or distill them from pre-trained models via knowledge distillation. Experiments on VGG, ResNet, and Wide-ResNet demonstrate that TNNs outperform state-of-the-art low-rank methods across a wide range of backbone networks and datasets.
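To make the core idea concrete, below is a minimal sketch of a factorized multilinear layer that acts on a 3rd-order input tensor without flattening it, in the spirit of the TNN layers described above. This is an illustrative Tucker-style construction, not the paper's exact formulation; the class name FactorizedLinear3D and all dimensions are hypothetical.

```python
# Minimal sketch (assumption: a per-mode factorized map, not the paper's exact TNN layer).
# A dense layer from a flattened (8*8*8)-dim input to a (4*4*4)-dim output would need
# 512 * 64 = 32,768 weights; the per-mode factors below use only 3 * (4 * 8) = 96.
import torch
import torch.nn as nn

class FactorizedLinear3D(nn.Module):  # hypothetical name, for illustration only
    """Maps (batch, I1, I2, I3) -> (batch, O1, O2, O3) with one small factor per mode,
    preserving the multi-dimensional structure instead of flattening to a vector."""
    def __init__(self, in_dims, out_dims):
        super().__init__()
        self.factors = nn.ParameterList(
            [nn.Parameter(torch.randn(o, i) * 0.02) for i, o in zip(in_dims, out_dims)]
        )

    def forward(self, x):
        # Multilinear map: contract each mode of x with its corresponding factor matrix.
        f1, f2, f3 = self.factors
        return torch.einsum('bijk,pi,qj,rk->bpqr', x, f1, f2, f3)

layer = FactorizedLinear3D(in_dims=(8, 8, 8), out_dims=(4, 4, 4))
y = layer(torch.randn(32, 8, 8, 8))   # output shape: (32, 4, 4, 4)
print(y.shape)
```

Because the layer is built from ordinary differentiable tensor contractions, standard backpropagation applies directly; such a layer could be trained from scratch or fitted to a pre-trained dense layer's outputs in a distillation-style setup, as the abstract describes.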