CIPL: Counterfactual Interactive Policy Learning to Eliminate Popularity Bias for Online Recommendation.

Authors

Zheng Yongsen, Qin Jinghui, Wei Pengxu, Chen Ziliang, Lin Liang

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Dec;35(12):17123-17136. doi: 10.1109/TNNLS.2023.3299929. Epub 2024 Dec 2.

DOI: 10.1109/TNNLS.2023.3299929
PMID: 37585330

Abstract

Popularity bias is a long-standing problem in recommender systems (RSs). Most existing work studies it for offline recommendation, and very few studies address it in online interactive recommendation scenarios. Because of the feedback loop between the user and the interactive system, bias amplification becomes increasingly serious over time. Existing methods, however, investigate the causal relations among different factors only statically, without considering the temporal dependencies inherent in online interactive recommendation, which makes them difficult to adapt to online settings. To address these problems, we propose a novel counterfactual interactive policy learning (CIPL) method to eliminate popularity bias in online recommendation. CIPL first scrutinizes the causal relations in interactive recommender models and formulates a novel temporal causal graph (TCG) to guide the training and counterfactual inference of the causal interactive recommendation system. Concretely, during model training, the TCG is used to estimate the causal effect of item popularity on the prediction score at each time step at which the user interacts with the system. In the test stage, it is also used to remove the negative effect of popularity bias. To train the causal interactive recommendation system, we formulate CIPL within the actor-critic framework with an online interactive environment simulator. Extensive experiments on three public benchmarks demonstrate that the proposed method achieves new state-of-the-art performance.
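
The feedback loop described in the abstract can be illustrated with a toy simulation. The sketch below is not the CIPL method: it uses no temporal causal graph, no actor-critic policy, and no learned model. It only assumes a hypothetical ranking score equal to a fixed item appeal plus a popularity term taken from logged clicks, and it contrasts that score with a counterfactual-style variant in which the popularity-only contribution is subtracted before ranking. All names (`simulate`, `top_share`, `appeal`, `slate`) and all parameter values are assumptions made for illustration.

```python
import random

def simulate(debias, num_items=20, steps=2000, slate=3, seed=0):
    """Return per-item exposure counts after a toy recommend-click-update loop."""
    rng = random.Random(seed)
    appeal = [0.5] * num_items   # assumption: every item is equally appealing
    clicks = [1] * num_items     # logged clicks double as the popularity signal
    exposure = [0] * num_items
    for _ in range(steps):
        total = sum(clicks)
        keyed = []
        for i in range(num_items):
            pop_only = clicks[i] / total   # score if only popularity acted
            full = appeal[i] + pop_only    # factual score: appeal plus popularity
            score = full - pop_only if debias else full
            keyed.append((score + rng.random() * 1e-6, i))  # tiny noise breaks ties
        shown = [i for _, i in sorted(keyed, reverse=True)[:slate]]
        for i in shown:
            exposure[i] += 1
            if rng.random() < appeal[i]:   # simulated user feedback
                clicks[i] += 1             # feedback re-enters the next ranking
    return exposure

def top_share(exposure, k=3):
    """Fraction of all exposure captured by the k most-exposed items."""
    return sum(sorted(exposure, reverse=True)[:k]) / sum(exposure)

biased = top_share(simulate(debias=False))
debiased = top_share(simulate(debias=True))
print(f"top-3 exposure share, popularity-driven: {biased:.2f}")
print(f"top-3 exposure share, popularity removed: {debiased:.2f}")
```

Because every item has the same appeal in this toy setup, any concentration of exposure comes from the loop itself: early clicks raise an item's popularity term, which keeps it in the slate, which earns it more clicks. Subtracting the popularity-only score removes that path, so exposure stays spread across items.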

