Mussay Ben, Feldman Dan, Zhou Samson, Braverman Vladimir, Osadchy Margarita
IEEE Trans Neural Netw Learn Syst. 2022 Dec;33(12):7829-7841. doi: 10.1109/TNNLS.2021.3088587. Epub 2022 Nov 30.
Model compression is crucial for the deployment of neural networks on devices with limited computational and memory resources. Many different methods achieve comparable accuracy of the compressed model and similar compression rates. However, the majority of compression methods are based on heuristics and offer no worst-case guarantees on the tradeoff between the compression rate and the approximation error for an arbitrary new sample. We propose the first efficient structured pruning algorithm with a provable tradeoff between its compression rate and the approximation error for any future test sample. Our method is based on the coreset framework: it approximates the output of a layer of neurons/filters by a coreset of neurons/filters in the previous layer and discards the rest. We apply this framework in a layer-by-layer fashion from the bottom to the top. Unlike previous works, our coreset is data-independent, meaning that it provably guarantees the accuracy of the function for any input [Formula: see text], including an adversarial one.
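The coreset idea described above can be illustrated with a minimal NumPy sketch for one hidden layer of a two-layer ReLU network: sample a small set of hidden neurons with probability proportional to a sensitivity score, and reweight their outgoing edges so the pruned layer is an unbiased estimator of the original output. This is only an illustrative sketch under simplifying assumptions; the norm-based sensitivity proxy here is an assumption, not the paper's exact data-independent bound.

```python
import numpy as np

def coreset_prune_layer(W1, b1, W2, m, seed=0):
    """Prune the hidden layer of y = W2 @ relu(W1 @ x + b1) down to m neurons.

    Neurons are sampled with replacement, with probability proportional to a
    simple sensitivity proxy (outgoing column norm times incoming row norm --
    an illustrative assumption, not the paper's exact bound). Outgoing weights
    are rescaled by 1/(m * p_j) so the pruned output is an unbiased estimate
    of the original output for any input x.
    """
    rng = np.random.default_rng(seed)
    # Per-neuron sensitivity proxy and the induced sampling distribution.
    s = np.linalg.norm(W2, axis=0) * np.linalg.norm(W1, axis=1)
    p = s / s.sum()
    idx = rng.choice(len(s), size=m, replace=True, p=p)
    # Keep only the sampled neurons; reweight outgoing edges for unbiasedness.
    W1_c, b1_c = W1[idx], b1[idx]
    W2_c = W2[:, idx] / (m * p[idx])
    return W1_c, b1_c, W2_c

def forward(W1, b1, W2, x):
    """Two-layer ReLU network forward pass."""
    return W2 @ np.maximum(W1 @ x + b1, 0.0)
```

Because the estimator is unbiased, averaging over the sampled neurons concentrates around the original layer output as the coreset size m grows; the paper's contribution is a provable bound on this error that holds for every input, not just inputs drawn from the training distribution.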