Wang Xiaowei, Dong Hongbin
College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China.
Entropy (Basel). 2023 Feb 23;25(3):406. doi: 10.3390/e25030406.
Click-through rate (CTR) prediction is a key task in evaluating recommender systems and estimating advertising traffic. Existing studies have shown that deep learning performs very well in prediction tasks, but most of them rely on deterministic models, which leaves a large gap in capturing uncertainty. Modeling uncertainty is a major challenge when applying machine learning to real-world problems in various domains. To quantify model uncertainty while achieving accurate and reliable predictions, this paper designs a CTR prediction framework that combines feature selection and feature interaction. Within this framework, a CTR prediction model based on Bayesian deep learning is proposed to quantify the uncertainty of the predictions. Building on a prediction framework that runs a squeeze network and a DNN in parallel, Monte Carlo dropout is used to approximate the posterior distribution over the model parameters and to obtain ensemble predictions. Epistemic and aleatoric uncertainty are defined, and information entropy is used to compute the sum of the two; epistemic uncertainty is measured by mutual information. Experimental results show that the proposed model outperforms other models in prediction performance and is able to quantify uncertainty.
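The uncertainty decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy one-hidden-layer network stands in for the parallel squeeze network and DNN, and the function names (binary_entropy, mc_dropout_predict, decompose_uncertainty), the weight shapes, and the dropout rate are all assumptions made for illustration. The sketch keeps dropout active at prediction time, averages the sampled click probabilities, takes the entropy of that average as the total uncertainty, the mean entropy of the individual samples as the aleatoric part, and their difference (the mutual information) as the epistemic part.

```python
import numpy as np

def binary_entropy(p, eps=1e-12):
    """Entropy (in nats) of a Bernoulli distribution with click probability p."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def mc_dropout_predict(x, w1, w2, drop_rate, n_samples, rng):
    """Toy stand-in model (assumption): one ReLU hidden layer with dropout
    kept active at prediction time. Returns shape (n_samples, n_rows)."""
    keep = 1.0 - drop_rate
    samples = []
    for _ in range(n_samples):
        hidden = np.maximum(x @ w1, 0.0)               # ReLU hidden layer
        mask = rng.random(hidden.shape) < keep         # fresh dropout mask per pass
        hidden = hidden * mask / keep                  # inverted-dropout scaling
        logits = hidden @ w2
        samples.append(1.0 / (1.0 + np.exp(-logits)))  # sigmoid -> click probability
    return np.stack(samples)

def decompose_uncertainty(probs):
    """probs: sampled click probabilities, shape (n_samples, n_rows)."""
    mean_p = probs.mean(axis=0)                     # ensemble prediction
    total = binary_entropy(mean_p)                  # entropy of the mean prediction
    aleatoric = binary_entropy(probs).mean(axis=0)  # mean entropy of each sample
    epistemic = total - aleatoric                   # mutual information
    return mean_p, total, aleatoric, epistemic

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 4))    # 5 hypothetical impressions, 4 features
    w1 = rng.normal(size=(4, 8))   # hypothetical, untrained weights
    w2 = rng.normal(size=8)
    probs = mc_dropout_predict(x, w1, w2, drop_rate=0.5, n_samples=100, rng=rng)
    mean_p, total, aleatoric, epistemic = decompose_uncertainty(probs)
    print("mean CTR :", np.round(mean_p, 3))
    print("total    :", np.round(total, 3))
    print("aleatoric:", np.round(aleatoric, 3))
    print("epistemic:", np.round(epistemic, 3))
```

When every stochastic pass agrees, the epistemic term is zero and all remaining uncertainty is aleatoric; when the passes disagree strongly, the epistemic term grows toward ln 2 for a binary click label.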