Luo Gen, Zhou Yiyi, Sun Xiaoshuai, Wang Yan, Cao Liujuan, Wu Yongjian, Huang Feiyue, Ji Rongrong
IEEE Trans Image Process. 2022;31:3386-3398. doi: 10.1109/TIP.2021.3139234. Epub 2022 May 11.
Despite its exciting performance, Transformer is criticized for its excessive parameters and computation cost. However, compressing Transformer remains an open problem due to the internal complexity of its layer designs, i.e., Multi-Head Attention (MHA) and Feed-Forward Network (FFN). To address this issue, we introduce Group-wise Transformation towards a universal yet lightweight Transformer for vision-and-language tasks, termed LW-Transformer. LW-Transformer applies Group-wise Transformation to reduce both the parameters and computations of Transformer, while preserving its two main properties, i.e., the efficient attention modeling on diverse subspaces of MHA, and the expanding-scaling feature transformation of FFN. We apply LW-Transformer to a set of Transformer-based networks and quantitatively evaluate them on three vision-and-language tasks and six benchmark datasets. Experimental results show that while saving a large number of parameters and computations, LW-Transformer achieves very competitive performance against the original Transformer networks for vision-and-language tasks. To examine its generalization ability, we apply LW-Transformer to the task of image classification, building its network on a recently proposed image Transformer called Swin-Transformer, where its effectiveness is also confirmed.
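To make the parameter savings concrete, the sketch below shows the common form of a group-wise linear transformation: the channel dimension is split into g groups, each processed by its own small weight matrix, which divides the layer's parameter count by roughly g. This is a minimal NumPy illustration of the general technique, not the authors' implementation; all function names and the choice of d=512, g=8 are illustrative assumptions.

```python
import numpy as np

def linear_params(d_in, d_out):
    # parameter count of a dense linear layer (weight + bias)
    return d_in * d_out + d_out

def groupwise_linear_params(d_in, d_out, groups):
    # each group has its own (d_in/g) x (d_out/g) weight and bias
    assert d_in % groups == 0 and d_out % groups == 0
    g_in, g_out = d_in // groups, d_out // groups
    return groups * (g_in * g_out + g_out)

def groupwise_linear(x, weights, biases):
    """Apply a group-wise linear transform.
    x: (n, d_in); weights: list of g arrays, each (d_in/g, d_out/g)."""
    chunks = np.split(x, len(weights), axis=-1)
    outs = [c @ w + b for c, w, b in zip(chunks, weights, biases)]
    return np.concatenate(outs, axis=-1)

d, g = 512, 8  # hypothetical hidden size and group count
print(linear_params(d, d))               # 262656
print(groupwise_linear_params(d, d, g))  # 33280, ~8x fewer parameters
```

Because each group only mixes channels within its own slice, the transform is cheaper but less expressive than a dense projection; the abstract's claim is that applying such transformations inside MHA and FFN retains their key properties while cutting cost.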