Instability of inverse probability weighting methods and a remedy for nonignorable missing data.

Author information

Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada.

National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland, USA.

Publication information

Biometrics. 2023 Dec;79(4):3215-3226. doi: 10.1111/biom.13881. Epub 2023 May 23.

DOI:10.1111/biom.13881

PMID:37221141

Abstract

Inverse probability weighting (IPW) methods are commonly used to analyze nonignorable missing data (NIMD) under the assumption of a logistic model for the missingness probability. However, solving IPW equations numerically may involve nonconvergence problems when the sample size is moderate and the missingness probability is high. Moreover, those equations often have multiple roots, and identifying the best root is challenging. Therefore, IPW methods may have low efficiency or even produce biased results. We identify the pitfall in these methods pathologically: they involve the estimation of a moment-generating function (MGF), and such functions are notoriously unstable in general. As a remedy, we model the outcome distribution given the covariates of the completely observed individuals semiparametrically. After forming an induced logistic regression (LR) model for the missingness status of the outcome and covariate, we develop a maximum conditional likelihood method to estimate the underlying parameters. The proposed method circumvents the estimation of an MGF and hence overcomes the instability of IPW methods. Our theoretical and simulation results show that the proposed method outperforms existing competitors greatly. Two real data examples are analyzed to illustrate the advantages of our method. We conclude that if only a parametric LR is assumed but the outcome regression model is left arbitrary, then one has to be cautious in using any of the existing statistical methods in problems involving NIMD.
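One way to see why a logistic missingness model brings a moment-generating function into play: the inverse of a logistic probability expit(alpha + beta*y) equals 1 + exp(-(alpha + beta*y)), so inverse-probability-weighted sums contain sample averages of exp(-beta*y). The sketch below is an illustration only and is not the authors' method. It assumes a standard normal outcome, and the names `inverse_weight`, `empirical_mgf`, `relative_sd_of_mgf_estimate`, `simulate_spread`, `alpha`, and `beta` are hypothetical. It shows how quickly the relative error of a plug-in MGF estimate grows with the size of the exponent.

```python
import math
import random
import statistics


def inverse_weight(y, alpha, beta):
    """Inverse of a logistic response probability, 1 / expit(alpha + beta * y).

    Algebraically this equals 1 + exp(-(alpha + beta * y)), i.e. it is
    exponential in the outcome y.
    """
    return 1.0 + math.exp(-(alpha + beta * y))


def empirical_mgf(sample, t):
    """Plug-in estimate of E[exp(t * Y)]: the sample average of exp(t * y)."""
    return sum(math.exp(t * y) for y in sample) / len(sample)


def relative_sd_of_mgf_estimate(t, n):
    """Exact relative standard deviation of empirical_mgf when Y ~ N(0, 1).

    Assumption: standard normal outcome, for which
    E[exp(tY)] = exp(t^2 / 2) and Var[exp(tY)] = exp(2 t^2) - exp(t^2),
    so sd(mean) / E[exp(tY)] = sqrt((exp(t^2) - 1) / n).
    """
    return math.sqrt((math.exp(t * t) - 1.0) / n)


def simulate_spread(t, n, replications, seed=0):
    """Monte Carlo spread of (estimate / truth) over repeated N(0, 1) samples."""
    rng = random.Random(seed)
    truth = math.exp(t * t / 2.0)
    ratios = []
    for _ in range(replications):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ratios.append(empirical_mgf(sample, t) / truth)
    return statistics.pstdev(ratios)


if __name__ == "__main__":
    n = 200
    for t in (0.5, 1.0, 2.0, 3.0):
        exact = relative_sd_of_mgf_estimate(t, n)
        simulated = simulate_spread(t, n, replications=500)
        print(f"t={t}: exact relative sd={exact:.3f}, simulated={simulated:.3f}")
```

For large exponents the simulated spread can differ noticeably from the exact value, because the average is dominated by a few extreme draws. This is the same heavy-tail behavior that makes such estimates unreliable at moderate sample sizes.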

Similar articles

Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.

Int J Biostat. 2017 Apr 20;13(1):/j/ijb.2017.13.issue-1/ijb-2016-0053/ijb-2016-0053.xml. doi: 10.1515/ijb-2016-0053.

Evaluation of predictive model performance of an existing model in the presence of missing data.

Stat Med. 2021 Jul 10;40(15):3477-3498. doi: 10.1002/sim.8978. Epub 2021 Apr 11.

Constrained empirical-likelihood confidence regions in nonignorable covariate-missing data problems.

Stat Med. 2019 Feb 10;38(3):452-479. doi: 10.1002/sim.7987. Epub 2018 Oct 11.

Comparison between inverse-probability weighting and multiple imputation in Cox model with missing failure subtype.

Stat Methods Med Res. 2024 Feb;33(2):344-356. doi: 10.1177/09622802231226328. Epub 2024 Jan 23.

On variance estimation of target population created by inverse probability weighting.

J Biopharm Stat. 2024 Aug;34(5):661-679. doi: 10.1080/10543406.2023.2244593. Epub 2023 Aug 24.

A Two-Step Approach for Analysis of Nonignorable Missing Outcomes in Longitudinal Regression: an Application to Upstate KIDS Study.

Paediatr Perinat Epidemiol. 2017 Sep;31(5):468-478. doi: 10.1111/ppe.12382. Epub 2017 Aug 2.

On the use of multiple imputation to address data missing by design as well as unintended missing data in case-cohort studies with a binary endpoint.

BMC Med Res Methodol. 2023 Dec 7;23(1):287. doi: 10.1186/s12874-023-02090-5.

Inverse probability weighting methods for Cox regression with right-truncated data.

Biometrics. 2020 Jun;76(2):484-495. doi: 10.1111/biom.13162. Epub 2019 Nov 11.

A nonparametric multiple imputation approach for missing categorical data.

BMC Med Res Methodol. 2017 Jun 6;17(1):87. doi: 10.1186/s12874-017-0360-2.
