


Effect-Invariant Mechanisms for Policy Generalization.

Authors

Sorawit Saengkyongam, Niklas Pfister, Predrag Klasnja, Susan Murphy, Jonas Peters

Affiliations

Seminar for Statistics, ETH Zürich, Zürich, Switzerland.

Department of Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark.

Publication

J Mach Learn Res. 2024;25.

PMID: 39082006
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11286230/
Abstract

Policy learning is an important component of many real-world learning systems. A major challenge in policy learning is how to adapt efficiently to unseen environments or tasks. Recently, it has been suggested to exploit invariant conditional distributions to learn models that generalize better to unseen environments. However, assuming invariance of entire conditional distributions (which we call full invariance) may be too strong of an assumption in practice. In this paper, we introduce a relaxation of full invariance called effect-invariance (e-invariance for short) and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization. We also discuss an extension that exploits e-invariance when we have a small sample from the test environment, enabling few-shot policy generalization. Our work does not assume an underlying causal graph or that the data are generated by a structural causal model; instead, we develop testing procedures to test e-invariance directly from data. We present empirical results using simulated data and a mobile health intervention dataset to demonstrate the effectiveness of our approach.
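The abstract's central idea can be made concrete with a small simulation. The sketch below is illustrative only and is not the paper's actual testing procedure: it constructs two environments in which the full conditional distribution of the outcome shifts across environments (so full invariance fails), while the treatment effect is shared (e-invariant), and checks that per-environment effect estimates agree. All function names (`sample_env`, `effect_estimate`) and the linear data-generating model are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_env(n, baseline_shift, rng):
    """Simulate one environment. The baseline E[Y | X, A=0] shifts across
    environments, so the full conditional P(Y | X, A) is NOT invariant,
    but the treatment effect tau(x) = 2*x is shared (e-invariant)."""
    x = rng.normal(size=n)
    a = rng.integers(0, 2, size=n)  # binary action, randomized
    y = baseline_shift + x + 2.0 * x * a + rng.normal(scale=0.1, size=n)
    return x, a, y

def effect_estimate(x, a, y):
    """Estimate the effect slope: OLS of y on [1, x, a, x*a],
    returning the coefficient on the interaction x*a."""
    design = np.column_stack([np.ones_like(x), x, a, x * a])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[3]

# Two environments with different baselines but the same effect mechanism.
est = [effect_estimate(*sample_env(5000, shift, rng)) for shift in (0.0, 3.0)]
print(est)  # both estimates should land near the shared effect slope of 2.0
```

In this toy setting, agreement of the effect estimates across environments is the signature that e-invariance would exploit; the paper develops formal tests of this property directly from data, without assuming a causal graph.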


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7647/11286230/a9e8b9122a49/nihms-1957387-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7647/11286230/ac843721a97d/nihms-1957387-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7647/11286230/f3f7e2650358/nihms-1957387-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7647/11286230/c0a588a4dbd4/nihms-1957387-f0004.jpg

Similar Articles

1. Effect-Invariant Mechanisms for Policy Generalization.
J Mach Learn Res. 2024;25.
2. Invariant Policy Learning: A Causal Perspective.
IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):8606-8620. doi: 10.1109/TPAMI.2022.3232363. Epub 2023 Jun 5.
3. Contrastive-ACE: Domain Generalization Through Alignment of Causal Mechanisms.
IEEE Trans Image Process. 2023;32:235-250. doi: 10.1109/TIP.2022.3227457. Epub 2022 Dec 19.
4. How do humans want causes to combine their effects? The role of analytically-defined causal invariance for generalizable causal knowledge.
Cognition. 2023 Jan;230:105303. doi: 10.1016/j.cognition.2022.105303. Epub 2022 Nov 15.
5. On the benefits of representation regularization in invariance based domain generalization.
Mach Learn. 2022;111(3):895-915. doi: 10.1007/s10994-021-06080-w. Epub 2022 Jan 1.
6. Progressive Invariant Causal Feature Learning for Single Domain Generalization.
IEEE Trans Image Process. 2025;34:2694-2706. doi: 10.1109/TIP.2025.3563772. Epub 2025 May 6.
7. Analytic Causal Knowledge for Constructing Useable Empirical Causal Knowledge: Two Experiments on Pre-schoolers.
Cogn Sci. 2022 May;46(5):e13137. doi: 10.1111/cogs.13137.
8. Generalizing Deep Learning for Medical Image Segmentation to Unseen Domains via Deep Stacked Transformation.
IEEE Trans Med Imaging. 2020 Jul;39(7):2531-2540. doi: 10.1109/TMI.2020.2973595. Epub 2020 Feb 12.
9. Generative Mixup Networks for Zero-Shot Learning.
IEEE Trans Neural Netw Learn Syst. 2025 Mar;36(3):4054-4065. doi: 10.1109/TNNLS.2022.3142181. Epub 2025 Feb 28.
10. Learning stable and predictive structures in kinetic systems.
Proc Natl Acad Sci U S A. 2019 Dec 17;116(51):25405-25411. doi: 10.1073/pnas.1905688116. Epub 2019 Nov 27.

References Cited by This Article

1. Invariant Policy Learning: A Causal Perspective.
IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):8606-8620. doi: 10.1109/TPAMI.2022.3232363. Epub 2023 Jun 5.
2. Statistical Inference with M-Estimators on Adaptively Collected Data.
Adv Neural Inf Process Syst. 2021 Dec;34:7460-7471.
3. Personalized HeartSteps: A Reinforcement Learning Algorithm for Optimizing Physical Activity.
Proc ACM Interact Mob Wearable Ubiquitous Technol. 2020 Mar;4(1). doi: 10.1145/3381007.
4. A Causal Framework for Distribution Generalization.
IEEE Trans Pattern Anal Mach Intell. 2022 Oct;44(10):6614-6630. doi: 10.1109/TPAMI.2021.3094760. Epub 2022 Sep 14.
5. Confidence intervals for policy evaluation in adaptive experiments.
Proc Natl Acad Sci U S A. 2021 Apr 13;118(15). doi: 10.1073/pnas.2014602118.
6. Assessing Time-Varying Causal Effect Moderation in Mobile Health.
J Am Stat Assoc. 2018;113(523):1112-1121. doi: 10.1080/01621459.2017.1305274. Epub 2017 Mar 29.
7. Efficacy of Contextually Tailored Suggestions for Physical Activity: A Micro-randomized Optimization Trial of HeartSteps.
Ann Behav Med. 2019 May 3;53(6):573-582. doi: 10.1093/abm/kay067.