Guided multiple imputation of missing data: using a subsample to strengthen the missing-at-random assumption.

Author information

Fraser Gary, Yan Ru

Affiliation

Loma Linda University, Loma Linda, California, USA.

Publication information

Epidemiology. 2007 Mar;18(2):246-52. doi: 10.1097/01.ede.0000254708.40228.8b.

Abstract

Multiple imputation can be a good way to handle missing data if the data are missing at random. However, this assumption is often difficult to verify. We describe an application of multiple imputation that makes this assumption plausible. The procedure requires contacting a random sample of subjects with incomplete data to fill in the missing information, and then adjusting the imputation model to incorporate the new data. Simulations with missing data that were decidedly not missing at random showed, as expected, that the method restored the original beta coefficients, whereas other methods of dealing with missing data failed. Using a dataset with real missing data, we found that different approaches to imputation produced moderately different results. Simulations suggest that filling in 10% of the data that were initially missing is sufficient for imputation in many epidemiologic applications and should produce approximately unbiased results, provided the follow-up response rate is high in the subsample of those with some originally missing data. Such a response rate can probably be achieved if this data collection is planned as the initial approach to dealing with the missing data, rather than undertaken at later stages, after further attempts have left only data that are very difficult to complete.
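
The procedure described in the abstract can be illustrated with a small simulation. What follows is only a minimal sketch under stated assumptions, not the authors' implementation: the data-generating model (y = 1 + 2x + noise), the missingness probabilities, the function names `ols` and `simulate`, and the use of a bootstrap to approximate parameter uncertainty are all hypothetical choices made for illustration. The outcome is made missing far more often when it is large (so the data are not missing at random), a random 10% of subjects with a missing outcome are "followed up", and the imputation model for the remaining missing outcomes is fitted to that followed-up subsample.

```python
import random


def ols(xs, ys):
    """Return (intercept, slope, residual_sd) of a simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, (rss / (n - 2)) ** 0.5


def simulate(n=4000, follow_up_fraction=0.10, n_imputations=20, seed=1):
    """Compare complete-case and guided-imputation slope estimates."""
    rng = random.Random(seed)

    # Hypothetical data: the true slope of y on x is 2.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [1 + 2 * xi + rng.gauss(0, 1) for xi in x]

    # Assumed non-random missingness: large outcomes are usually missing.
    missing = [rng.random() < (0.9 if yi > 1 else 0.05) for yi in y]
    observed = [i for i in range(n) if not missing[i]]
    incomplete = [i for i in range(n) if missing[i]]

    # Complete-case analysis ignores every subject with a missing outcome.
    _, cc_slope, _ = ols([x[i] for i in observed], [y[i] for i in observed])

    # Follow up a random subsample of the incomplete subjects and assume
    # that all of them respond with their true outcome.
    k = max(3, round(follow_up_fraction * len(incomplete)))
    followed = rng.sample(incomplete, k)
    followed_set = set(followed)
    still_missing = [i for i in incomplete if i not in followed_set]

    slopes = []
    for _ in range(n_imputations):
        # Bootstrap the followed-up subsample so that each imputation uses
        # slightly different imputation-model parameters.
        boot = [rng.choice(followed) for _ in range(k)]
        a, b, sd = ols([x[i] for i in boot], [y[i] for i in boot])

        # Start from the known outcomes (observed or followed up) and
        # overwrite only the outcomes that are still missing.
        y_completed = list(y)
        for i in still_missing:
            y_completed[i] = a + b * x[i] + rng.gauss(0, sd)

        _, slope, _ = ols(x, y_completed)
        slopes.append(slope)

    # Pooled point estimate: the mean of the per-imputation slopes.
    guided_slope = sum(slopes) / len(slopes)
    return cc_slope, guided_slope


if __name__ == "__main__":
    cc, guided = simulate()
    print(f"complete-case slope: {cc:.3f}")
    print(f"guided-imputation slope: {guided:.3f}")
```

In this sketch the imputation model is fitted only to subjects whose outcome was initially missing, which is one simple way to let the follow-up data inform the imputation; the sketch also pools only the point estimate and omits the between-imputation variance that a full analysis would report.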
