Learning Compressed Transforms with Low Displacement Rank
Thomas Anna T, Gu Albert, Dao Tri, Rudra Atri, Ré Christopher
Department of Computer Science, Stanford University.
Department of Computer Science and Engineering, University at Buffalo, SUNY.
Adv Neural Inf Process Syst. 2018 Dec;2018:9052-9060.
The low displacement rank (LDR) framework for structured matrices represents a matrix through two displacement operators and a low-rank residual. Existing use of LDR matrices in deep learning has applied fixed displacement operators encoding forms of shift invariance akin to convolutions. We introduce a rich class of LDR matrices with more general displacement operators, and explicitly learn over both the operators and the low-rank component. This class generalizes several previous constructions while preserving compression and efficient computation. We prove bounds on the VC dimension of multi-layer neural networks with structured weight matrices and show empirically that our compact parameterization can reduce the sample complexity of learning. When replacing weight layers in fully-connected, convolutional, and recurrent neural networks for image classification and language modeling tasks, our new classes exceed the accuracy of existing compression approaches, and on some tasks even outperform general unstructured layers while using more than 20X fewer parameters.
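To make the representation concrete, the following is a minimal NumPy sketch of the Sylvester form of the displacement equation, A W − W B = G Hᵀ, using the fixed unit-circulant operators Z₁ and Z₋₁ that characterize the Toeplitz-like class from prior work. The operator choice, sizes, and the dense Sylvester solve are illustrative assumptions, not the paper's implementation; the paper learns the operators themselves and relies on fast structured algorithms rather than a dense solve.

```python
import numpy as np
from scipy.linalg import solve_sylvester

def shift(n, f):
    """Unit-f-circulant shift operator Z_f: ones on the subdiagonal, f in the corner."""
    Z = np.zeros((n, n))
    Z[1:, :-1] = np.eye(n - 1)
    Z[0, -1] = f
    return Z

n, r = 8, 2
rng = np.random.default_rng(0)

# Fixed operators from prior work: Z_1 and Z_{-1} yield the classic
# Toeplitz-like class. The paper's contribution is to learn A and B
# (e.g. over their subdiagonal entries) jointly with the low-rank factors.
A, B = shift(n, 1.0), shift(n, -1.0)
G = rng.standard_normal((n, r))  # low-rank residual factors: 2nr parameters total
H = rng.standard_normal((n, r))

# Recover the dense n x n matrix W from its O(nr) representation by
# solving the Sylvester equation A W + W (-B) = G H^T. This dense solve
# is for illustration only; structured operators admit fast algorithms.
W = solve_sylvester(A, -B, G @ H.T)

# Sanity check: the displacement A W - W B has rank at most r.
print(np.linalg.matrix_rank(A @ W - W @ B))  # -> 2
```

Letting the entries of A and B be learned, rather than fixed as above, enlarges the class beyond the Toeplitz-, Hankel-, Vandermonde-, and Cauchy-like constructions while keeping the parameter count at O(nr).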