Improving upon the efficiency of complete case analysis when covariates are MNAR.

Authors

Bartlett Jonathan W, Carpenter James R, Tilling Kate, Vansteelandt Stijn

Affiliations

Centre for Statistical Methodology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK

Centre for Statistical Methodology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK and MRC Clinical Trials Unit, Kingsway, London WC2B 6NH, UK.

Publication information

Biostatistics. 2014 Oct;15(4):719-30. doi: 10.1093/biostatistics/kxu023. Epub 2014 Jun 6.

DOI:10.1093/biostatistics/kxu023
PMID:24907708
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4173105/
Abstract

Missing values in covariates of regression models are a pervasive problem in empirical research. Popular approaches for analyzing partially observed datasets include complete case analysis (CCA), multiple imputation (MI), and inverse probability weighting (IPW). In the case of missing covariate values, these methods (as typically implemented) are valid under different missingness assumptions. In particular, CCA is valid under missing not at random (MNAR) mechanisms in which missingness in a covariate depends on the value of that covariate, but is conditionally independent of outcome. In this paper, we argue that in some settings such an assumption is more plausible than the missing at random assumption underpinning most implementations of MI and IPW. When the former assumption holds, although CCA gives consistent estimates, it does not make use of all observed information. We therefore propose an augmented CCA approach which makes the same conditional independence assumption for missingness as CCA, but which improves efficiency through specification of an additional model for the probability of missingness, given the fully observed variables. The new method is evaluated using simulations and illustrated through application to data on reported alcohol consumption and blood pressure from the US National Health and Nutrition Examination Survey, in which data are likely MNAR independent of outcome.
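The claim that CCA remains valid when missingness in a covariate depends on that covariate, but not on the outcome, can be checked with a small simulation. The sketch below is not the authors' augmented CCA estimator. It is a hypothetical illustration only, and the data-generating model, the logistic missingness mechanisms, and the function names (`ols_slope`, `expit`, `simulate`) are all assumptions. It fits a simple linear regression to the complete cases under two mechanisms: one where observing `x` depends only on `x`, and one where it depends on `y`.

```python
import math
import random


def ols_slope(xs, ys):
    """Closed-form least-squares slope for a simple linear regression of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def expit(t):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-t))


def simulate(n=20000, seed=0):
    """Return complete-case slopes under two assumed missingness mechanisms.

    Assumed (hypothetical) outcome model: y = 1 + 2*x + noise, so the true slope is 2.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

    # Mechanism A: x is observed with a probability that depends on x only
    # (MNAR in x, but independent of y given x).
    seen_a = [rng.random() < expit(-2.0 * xi) for xi in x]
    # Mechanism B: x is observed with a probability that depends on the outcome y.
    seen_b = [rng.random() < expit(-2.0 * yi) for yi in y]

    slope_a = ols_slope(
        [xi for xi, s in zip(x, seen_a) if s],
        [yi for yi, s in zip(y, seen_a) if s],
    )
    slope_b = ols_slope(
        [xi for xi, s in zip(x, seen_b) if s],
        [yi for yi, s in zip(y, seen_b) if s],
    )
    return slope_a, slope_b


if __name__ == "__main__":
    slope_a, slope_b = simulate(seed=1)
    print("CCA slope, missingness depends on x only:", round(slope_a, 3))
    print("CCA slope, missingness depends on y:     ", round(slope_b, 3))
```

Under these assumptions, the complete-case slope under mechanism A should land close to the true value of 2, while the slope under mechanism B should be visibly attenuated. Mechanism A still discards every incomplete record, which is the efficiency loss the augmented approach described in the abstract aims to recover.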

Similar articles

1
Improving upon the efficiency of complete case analysis when covariates are MNAR.
Biostatistics. 2014 Oct;15(4):719-30. doi: 10.1093/biostatistics/kxu023. Epub 2014 Jun 6.
2
Approaches for missing covariate data in logistic regression with MNAR sensitivity analyses.
Biom J. 2020 Jul;62(4):1025-1037. doi: 10.1002/bimj.201900117. Epub 2020 Jan 20.
3
Accounting for bias due to outcome data missing not at random: comparison and illustration of two approaches to probabilistic bias analysis: a simulation study.
BMC Med Res Methodol. 2024 Nov 13;24(1):278. doi: 10.1186/s12874-024-02382-4.
4
Improving estimation efficiency for regression with MNAR covariates.
Biometrics. 2020 Mar;76(1):270-280. doi: 10.1111/biom.13131. Epub 2019 Nov 7.
5
Multiple imputation using auxiliary imputation variables that only predict missingness can increase bias due to data missing not at random.
BMC Med Res Methodol. 2024 Oct 7;24(1):231. doi: 10.1186/s12874-024-02353-9.
6
Comparison of methods to handle missing values in a continuous index test in a diagnostic accuracy study - a simulation study.
BMC Med Res Methodol. 2025 May 27;25(1):147. doi: 10.1186/s12874-025-02594-2.
7
Evaluation of multiple imputation approaches for handling missing covariate information in a case-cohort study with a binary outcome.
BMC Med Res Methodol. 2022 Apr 3;22(1):87. doi: 10.1186/s12874-021-01495-4.
8
A Bayesian Latent Variable Selection Model for Nonignorable Missingness.
Multivariate Behav Res. 2022 Mar-May;57(2-3):478-512. doi: 10.1080/00273171.2021.1874259. Epub 2021 Feb 2.
9
Evaluation of predictive model performance of an existing model in the presence of missing data.
Stat Med. 2021 Jul 10;40(15):3477-3498. doi: 10.1002/sim.8978. Epub 2021 Apr 11.
10
Heckman imputation models for binary or continuous MNAR outcomes and MAR predictors.
BMC Med Res Methodol. 2018 Aug 31;18(1):90. doi: 10.1186/s12874-018-0547-1.

Cited by

1
Bias and Efficiency Comparison between Multiple Imputation and Available-Case Analysis for Missing Data in Longitudinal Models.
Stat Biosci. 2025 Jun 12. doi: 10.1007/s12561-025-09493-6.
2
Improving Data Integrity in Samples Obtained From Web-Based Recruitment: Protocol for the Development of a Novel System for Assessing Participant Authenticity in a Remote Longitudinal Cohort Study of Polysubstance Use.
JMIR Res Protoc. 2025 Aug 14;14:e69956. doi: 10.2196/69956.
3
The Longitudinal Association Between Chronic Back Pain and Cognitive Decline in Older Adults With Mediation Analysis: An Analysis of Four Population-Based Databases.
Eur J Pain. 2025 Sep;29(8):e70084. doi: 10.1002/ejp.70084.
4
Instability in the environment and children's in-school self-regulatory behaviors.
Front Psychol. 2025 Mar 18;16:1498961. doi: 10.3389/fpsyg.2025.1498961. eCollection 2025.
5
Use of renin-angiotensin system blockers and posttraumatic stress disorder risk in the UK Biobank: a retrospective cohort study.
BMC Med. 2024 Oct 23;22(1):489. doi: 10.1186/s12916-024-03704-5.
6
Understanding the implications of a complete case analysis for regression models with a right-censored covariate.
Am Stat. 2024;78(3):335-344. doi: 10.1080/00031305.2023.2282629. Epub 2023 Dec 21.
7
Testing the missing at random assumption in generalized linear models in the presence of instrumental variables.
Scand Stat Theory Appl. 2024 Mar;51(1):334-354. doi: 10.1111/sjos.12685. Epub 2023 Aug 7.
8
Mother-daughter communication of sexual and reproductive health (SRH) matters and associated factors among Sinhalese adolescent girls aged 14-19 years, in Sri Lanka.
BMC Womens Health. 2023 Aug 31;23(1):461. doi: 10.1186/s12905-023-02617-4.
9
Understanding the Broader Impact of Stuttering: Suicidal Ideation.
Am J Speech Lang Pathol. 2023 Sep 11;32(5):2087-2110. doi: 10.1044/2023_AJSLP-23-00007. Epub 2023 Jul 20.
10
Increasing efficiency of SVMp+ for handling missing values in healthcare prediction.
PLOS Digit Health. 2023 Jun 29;2(6):e0000281. doi: 10.1371/journal.pdig.0000281. eCollection 2023 Jun.
