Multi-Source Conformal Inference Under Distribution Shift

Authors

Liu Yi, Levis Alexander W, Normand Sharon-Lise, Han Larry

Affiliations

North Carolina State University, Department of Statistics, Raleigh, NC, USA.

Carnegie Mellon University, Department of Statistics, Pittsburgh, PA, USA.

Publication information

Proc Mach Learn Res. 2024 Jul;235:31344-31382.

PMID: 39193374
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11345809/

Abstract

Recent years have experienced increasing utilization of complex machine learning models across multiple sources of data to inform more generalizable decision-making. However, distribution shifts across data sources and privacy concerns related to sharing individual-level data, coupled with a lack of uncertainty quantification from machine learning predictions, make it challenging to achieve valid inferences in multi-source environments. In this paper, we consider the problem of obtaining distribution-free prediction intervals for a target population, leveraging multiple potentially biased data sources. We derive the efficient influence functions for the quantiles of unobserved outcomes in the target and source populations, and show that one can incorporate machine learning prediction algorithms in the estimation of nuisance functions while still achieving parametric rates of convergence to nominal coverage probabilities. Moreover, when conditional outcome invariance is violated, we propose a data-adaptive strategy to upweight informative data sources for efficiency gain and downweight non-informative data sources for bias reduction. We highlight the robustness and efficiency of our proposals for a variety of conformal scores and data-generating mechanisms via extensive synthetic experiments. Hospital length of stay prediction intervals for pediatric patients undergoing a high-risk cardiac surgical procedure between 2016-2022 in the U.S. illustrate the utility of our methodology.
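
To make the idea of a distribution-free prediction interval under distribution shift concrete, the following is a minimal sketch of standard weighted split conformal prediction for a single source and a single target population. It is not the estimator proposed in the paper, which relies on efficient influence functions and data-adaptive weighting across several sources. Everything in the sketch is an assumption made for illustration: the function names `weighted_conformal_quantile` and `simulate_coverage`, the simulated data, the linear predictor, the absolute-residual conformal score, and the density ratio, which is treated as known here although in practice it would have to be estimated.

```python
import numpy as np


def weighted_conformal_quantile(scores, weights, test_weight, alpha):
    """Weighted (1 - alpha) quantile of calibration scores.

    The test point contributes a point mass at +infinity with weight
    `test_weight`, as in weighted split conformal prediction.
    """
    order = np.argsort(scores)
    s = np.asarray(scores, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    total = w.sum() + test_weight
    cum = np.cumsum(w) / total
    # First sorted score whose cumulative weight reaches 1 - alpha.
    idx = np.searchsorted(cum, 1 - alpha, side="left")
    return s[idx] if idx < len(s) else np.inf


def simulate_coverage(n=1000, n_test=500, alpha=0.1, seed=0):
    """Empirical target coverage in a toy covariate-shift simulation."""
    rng = np.random.default_rng(seed)

    def draw(mean, size):
        # Hypothetical data-generating process with heteroscedastic noise.
        x = rng.normal(mean, 1.0, size)
        y = x + (1.0 + np.abs(x)) * rng.normal(size=size)
        return x, y

    x_tr, y_tr = draw(0.0, n)       # source data used to fit the predictor
    x_cal, y_cal = draw(0.0, n)     # source data used for calibration
    x_te, y_te = draw(1.0, n_test)  # target data with shifted covariates

    slope, intercept = np.polyfit(x_tr, y_tr, deg=1)

    def predict(x):
        return slope * x + intercept

    def ratio(x):
        # Assumed known density ratio N(1, 1) / N(0, 1) = exp(x - 0.5).
        return np.exp(x - 0.5)

    scores = np.abs(y_cal - predict(x_cal))  # absolute-residual scores
    w_cal = ratio(x_cal)

    covered = 0
    for x0, y0 in zip(x_te, y_te):
        q = weighted_conformal_quantile(scores, w_cal, ratio(x0), alpha)
        covered += int(abs(y0 - predict(x0)) <= q)
    return covered / n_test


if __name__ == "__main__":
    print(f"empirical target coverage: {simulate_coverage():.3f}")
```

In this sketch the interval for a target covariate value `x0` is `predict(x0) ± q`. Reweighting the calibration scores by the density ratio is what restores the nominal coverage level on the shifted target population. An unweighted quantile would be calibrated for the source population only.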
