Doubly Robust Nonparametric Multiple Imputation for Ignorable Missing Data.

Authors

Long Qi, Hsu Chiu-Hsieh, Li Yisheng

Affiliation

Emory University.

Publication

Stat Sin. 2012;22:149-172.

Abstract

Missing data are common in medical and social science studies and often pose a serious challenge in data analysis. Multiple imputation methods are popular and natural tools for handling missing data, replacing each missing value with a set of plausible values that represent the uncertainty about the underlying values. We consider a case of missing at random (MAR) and investigate the estimation of the marginal mean of an outcome variable in the presence of missing values when a set of fully observed covariates is available. We propose a new nonparametric multiple imputation (MI) approach that uses two working models to achieve dimension reduction and define the imputing sets for the missing observations. Compared with existing nonparametric imputation procedures, our approach can better handle covariates of high dimension, and is doubly robust in the sense that the resulting estimator remains consistent if either of the working models is correctly specified. Compared with existing doubly robust methods, our nonparametric MI approach is more robust to the misspecification of both working models; it also avoids the use of inverse-weighting and hence is less sensitive to missing probabilities that are close to 1. We propose a sensitivity analysis for evaluating the validity of the working models, allowing investigators to choose the optimal weights so that the resulting estimator relies either completely or more heavily on the working model that is likely to be correctly specified and achieves improved efficiency. We investigate the asymptotic properties of the proposed estimator, and perform simulation studies to show that the proposed method compares favorably with some existing methods in finite samples. The proposed method is further illustrated using data from a colorectal adenoma study.
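The abstract describes the approach only at a high level, so the following is a minimal sketch of one way such a two-score nearest-neighbor multiple imputation could look. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the linear outcome working model, the logistic missingness working model, the weighted Euclidean distance on standardized scores, the bootstrap refit per imputation, and the names `nn_mi_marginal_mean`, `w_outcome`, and `k`.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares coefficients for the outcome working model y ~ 1 + X."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def fit_logistic(X, r, n_iter=25, ridge=1e-6):
    """Newton-Raphson fit of the missingness working model r ~ 1 + X."""
    A = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-A @ beta))
        W = p * (1.0 - p)
        H = A.T @ (A * W[:, None]) + ridge * np.eye(A.shape[1])
        beta += np.linalg.solve(H, A.T @ (r - p))
    return beta

def standardize(s):
    return (s - s.mean()) / s.std()

def nn_mi_marginal_mean(X, y, observed, w_outcome=0.5, k=5,
                        n_imputations=10, seed=0):
    """Hypothetical sketch: estimate E[Y] by nearest-neighbor multiple imputation.

    X: (n, p) fully observed covariates; y: (n,) outcome (any value where missing);
    observed: (n,) boolean mask; w_outcome: weight in [0, 1] on the outcome score,
    with 1 - w_outcome placed on the missingness score.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    donors = np.flatnonzero(observed)
    estimates = []
    for _ in range(n_imputations):
        # Assumption: refit both working models on a bootstrap sample so that
        # each imputed data set reflects uncertainty in the fitted scores.
        idx = rng.integers(0, n, n)
        Xb, yb, ob = X[idx], y[idx], observed[idx]
        beta_y = fit_linear(Xb[ob], yb[ob])
        beta_r = fit_logistic(Xb, ob.astype(float))
        # Two scalar predictive scores reduce the covariates to two dimensions.
        s_y = standardize(A @ beta_y)
        s_r = standardize(A @ beta_r)
        y_imp = y.copy()
        for i in np.flatnonzero(~observed):
            dist = np.sqrt(w_outcome * (s_y[donors] - s_y[i]) ** 2
                           + (1.0 - w_outcome) * (s_r[donors] - s_r[i]) ** 2)
            imputing_set = donors[np.argsort(dist)[:k]]  # k nearest observed cases
            y_imp[i] = y[rng.choice(imputing_set)]       # draw one donor value
        estimates.append(y_imp.mean())
    return float(np.mean(estimates))
```

Setting `w_outcome` to 1 or 0 makes the matching rely entirely on one working model, which loosely mirrors the weighting idea in the abstract. Rerunning the estimate over a grid of `w_outcome` values is one simple way to see how sensitive the result is to that choice.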

Similar articles

Doubly robust multiple imputation using kernel-based techniques.
Biom J. 2016 May;58(3):588-606. doi: 10.1002/bimj.201400256. Epub 2015 Dec 9.

Cited by

Inference from Nonrandom Samples Using Bayesian Machine Learning.
J Surv Stat Methodol. 2022 Jan 20;11(2):433-455. doi: 10.1093/jssam/smab049. eCollection 2023 Apr.
Doubly robust multiple imputation using kernel-based techniques.
Biom J. 2016 May;58(3):588-606. doi: 10.1002/bimj.201400256. Epub 2015 Dec 9.
