Double Sampling for Informatively Missing Data in Electronic Health Record-Based Comparative Effectiveness Research.

Authors

Levis Alexander W, Mukherjee Rajarshi, Wang Rui, Fischer Heidi, Haneuse Sebastien

Affiliations

Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, Pennsylvania.

Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, Massachusetts.

Publication information

Stat Med. 2024 Dec 30;43(30):6086-6098. doi: 10.1002/sim.10298. Epub 2024 Dec 5.

DOI: 10.1002/sim.10298
PMID: 39638313
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11639654/

Abstract

Missing data arise in most applied settings and are ubiquitous in electronic health records (EHR). When data are missing not at random (MNAR) with respect to measured covariates, sensitivity analyses are often considered. These solutions, however, are often unsatisfying in that they are not guaranteed to yield actionable conclusions. Motivated by an EHR-based study of long-term outcomes following bariatric surgery, we consider the use of double sampling as a means to mitigate MNAR outcome data when the statistical goals are estimation and inference regarding causal effects. We describe assumptions that are sufficient for the identification of the joint distribution of confounders, treatment, and outcome under this design. Additionally, we derive efficient and robust estimators of the average causal treatment effect under a nonparametric model and under a model assuming outcomes were, in fact, initially missing at random (MAR). We compare these in simulations to an approach that adaptively estimates based on evidence of violation of the MAR assumption. Finally, we also show that the proposed double sampling design can be extended to handle arbitrary coarsening mechanisms, and derive nonparametric efficient estimators of any smooth full data functional.
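
To make the double sampling idea concrete, the following is a minimal simulation sketch in Python. It is not the efficient or robust estimator derived in the paper. It uses a plain inverse-probability-weighted, standardized difference in means, and it assumes that the double-sampling probability `pi` is known by design and that every double-sampled subject yields an outcome. The data-generating model, all coefficients, and the function names (`simulate`, `weighted_standardized_ate`, `compare_estimators`) are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n, rng, pi=0.3):
    """Simulate a cohort with MNAR outcomes and a follow-up double sample.

    All coefficients are hypothetical; the true average treatment effect is 1.0.
    """
    x = rng.binomial(1, 0.5, n)                      # binary confounder, always observed
    a = rng.binomial(1, 0.3 + 0.4 * x)               # treatment depends on the confounder
    y = 1.0 * a + 2.0 * x + rng.normal(0.0, 1.0, n)  # outcome
    # MNAR: the chance the outcome is initially recorded depends on the outcome itself.
    p_obs = 1.0 / (1.0 + np.exp(-(0.5 - 1.5 * y)))
    r = rng.binomial(1, p_obs)                       # 1 = outcome initially observed
    # Double sampling: follow up a random fraction pi of subjects with missing outcomes.
    s = (1 - r) * rng.binomial(1, pi, n)             # 1 = outcome recovered at follow-up
    return x, a, y, r, s

def weighted_standardized_ate(x, a, y, w):
    """Standardize weighted arm means over the confounder distribution.

    Subjects with weight 0 never contribute their outcome, so unobserved
    values of y are not used.
    """
    ate = 0.0
    for level in (0, 1):
        px = np.mean(x == level)                     # confounder is fully observed
        arm_means = []
        for arm in (1, 0):
            cell = (x == level) & (a == arm)
            arm_means.append(np.sum(w[cell] * y[cell]) / np.sum(w[cell]))
        ate += px * (arm_means[0] - arm_means[1])
    return ate

def compare_estimators(n=200_000, seed=0, pi=0.3):
    rng = np.random.default_rng(seed)
    x, a, y, r, s = simulate(n, rng, pi)
    w_complete_case = r.astype(float)                # ignores the missingness mechanism
    w_double_sample = r + s / pi                     # design weights: 1 or 1/pi
    return (weighted_standardized_ate(x, a, y, w_complete_case),
            weighted_standardized_ate(x, a, y, w_double_sample))

if __name__ == "__main__":
    cc, ds = compare_estimators()
    print(f"complete-case ATE estimate:   {cc:.3f}")
    print(f"double-sampling ATE estimate: {ds:.3f}")
```

In this toy setup the complete-case estimate should drift away from the true effect of 1.0 because selection depends on the outcome, while the design-weighted estimate should land close to it. The weights work because the follow-up subsample is drawn at random from those with missing outcomes, so each recovered subject stands in for `1/pi` subjects.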



