Cyberspace Institute of Advanced Technology (CIAT), Guangzhou University, Guangzhou 510006, China.
Academy of Digital Human Technology, Shenzhen Tianyuan Dic Information Technology Co., Ltd, Shenzhen 518057, China.
Comput Intell Neurosci. 2022 Aug 29;2022:1775496. doi: 10.1155/2022/1775496. eCollection 2022.
Click-through rate (CTR) prediction estimates the probability that a user will click on a recommended item, a task that is extremely important in recommender systems. The recently proposed deep factorization machine (DeepFM) algorithm incorporates a factorization machine (FM) to learn low-order feature interactions alongside a deep component that learns higher-order feature interactions. However, DeepFM lacks diverse user representations and does not consider text. To address this, we propose a text-attention FM (TAFM) based on the DeepFM algorithm. First, the attention mechanism in TAFM addresses the diverse representations of users and goods and mines the features that interest users most. Second, TAFM learns text features through its text component, text attention component, and N-gram text feature extraction component, which together explore potential user preferences and the diversity among user interests. In addition, the convolutional autoencoder in TAFM learns higher-level features, making the higher-order feature mining process more comprehensive. On the public dataset, the best-performing existing models are the deep cross network (DCN), DeepFM, and the product-based neural network (PNN), whose AUC scores lie between 0.698 and 0.699. Our model achieves an AUC of 0.730, at least 0.03 (about 3 percentage points) higher than the existing models. The accuracy of our model is at least 0.1 percentage points higher than that of the existing models.
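To make the FM component mentioned above concrete, the following is a minimal sketch of the standard second-order factorization machine score that DeepFM builds on. It is not the TAFM implementation: the function names, the toy feature vector, and the parameter values are all hypothetical and chosen only for illustration. The sketch computes the pairwise interaction term both naively and with the well-known linear-time identity, then passes the result through a sigmoid to obtain a click probability.

```python
import math
from itertools import combinations


def fm_pairwise_naive(x, v):
    """Sum of <v_i, v_j> * x_i * x_j over all feature pairs i < j."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(v[i], v[j]))
        total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Same quantity via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    k = len(v[0])  # embedding dimension
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += s * s - sq
    return 0.5 * total


def fm_predict(x, w0, w, v):
    """Click probability: sigmoid(bias + linear term + pairwise term)."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    logit = linear + fm_pairwise_fast(x, v)
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical toy input: 4 binary features, 2-dimensional embeddings.
x = [1.0, 0.0, 1.0, 1.0]
w0 = -0.5
w = [0.2, -0.1, 0.05, 0.3]
v = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.1]]

naive = fm_pairwise_naive(x, v)
fast = fm_pairwise_fast(x, v)
prob = fm_predict(x, w0, w, v)
```

Both pairwise functions return the same value; the second avoids the explicit loop over feature pairs, which matters when the feature vector is large and sparse, as is typical in CTR data.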