Liu Zirui, Song Qingquan, Li Li, Choi Soo-Hyun, Chen Rui, Hu Xia
Computer Science Department, Rice University, Houston, TX, United States.
LinkedIn, Sunnyvale, CA, United States.
Front Big Data. 2023 Jun 15;6:1195742. doi: 10.3389/fdata.2023.1195742. eCollection 2023.
Embedding is widely used in recommendation models to learn feature representations. However, the traditional embedding technique, which assigns a fixed size to all categorical features, may be suboptimal for the following reasons. In the recommendation domain, the embeddings of most categorical features can be trained with less capacity without impacting model performance, so storing embeddings of equal length may incur unnecessary memory usage. Existing work that tries to allocate a customized size to each feature usually either simply scales the embedding size with the feature's popularity or formulates the size allocation problem as an architecture selection problem. Unfortunately, most of these methods either suffer a large performance drop or incur significant extra time cost for searching proper embedding sizes. In this article, instead of formulating the size allocation problem as an architecture selection problem, we approach it from a pruning perspective and propose the Pruning-based Multi-size Embedding (PME) framework. During the search phase, we prune the dimensions that have the least impact on model performance in the embedding to reduce its capacity. We then show that the customized size of each token can be obtained by transferring the capacity of its pruned embedding, at significantly less search cost. Experimental results validate that PME can efficiently find proper sizes and hence achieve strong performance while significantly reducing the number of parameters in the embedding layer.
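The core pruning step described above can be illustrated with a minimal sketch. Note this is only an assumed, simplified stand-in: it uses a global magnitude threshold as a proxy for the paper's "least impact on model performance" criterion, and the function name `prune_embedding_sizes` and its `keep_ratio` parameter are hypothetical, not from the paper.

```python
import numpy as np

def prune_embedding_sizes(embedding, keep_ratio=0.25):
    """Prune embedding weights below a global magnitude threshold.

    For each token (row), the number of surviving dimensions becomes
    that token's customized embedding size, so popular/informative
    tokens can retain more capacity than rarely useful ones.
    Magnitude is an illustrative proxy for a saliency criterion.
    """
    # Pick a threshold so that roughly `keep_ratio` of all weights survive.
    flat = np.abs(embedding).ravel()
    threshold = np.quantile(flat, 1.0 - keep_ratio)
    # Mask of retained weights; pruned entries are treated as zero capacity.
    mask = np.abs(embedding) >= threshold
    # Per-token customized size = count of surviving dimensions per row.
    sizes = mask.sum(axis=1)
    return mask, sizes

# Toy embedding table: 100 tokens, uniform dimension 16.
rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))
mask, sizes = prune_embedding_sizes(emb, keep_ratio=0.25)
```

After pruning, `sizes` varies per token, which is what lets the framework replace one fixed-size table with multiple smaller, per-token-sized embeddings.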