Outcome-sensitive multiple imputation: a simulation study.

Authors

Kontopantelis Evangelos, White Ian R, Sperrin Matthew, Buchan Iain

Affiliations

The Farr Institute for Health Informatics Research, University of Manchester, Vaughan House, Manchester, M13 9GB, UK.

NIHR School for Primary Care Research, Centre for Primary Care, Institute of Population Health, University of Manchester, Manchester, UK.

Publication information

BMC Med Res Methodol. 2017 Jan 9;17(1):2. doi: 10.1186/s12874-016-0281-5.

Abstract

BACKGROUND

Multiple imputation is frequently used to deal with missing data in healthcare research. Although it is known that the outcome should be included in the imputation model when imputing missing covariate values, it is not known whether the outcome itself should be imputed. Similarly, no clear recommendations exist on the utility of incorporating a secondary outcome, if available, in the imputation model; the level of protection offered when data are missing not at random; or the implications of dataset size and missingness levels.

METHODS

We used realistic assumptions to generate thousands of datasets across a broad spectrum of contexts: three mechanisms of missingness (completely at random; at random; not at random); varying extents of missingness (20-80% missing data); and different sample sizes (1,000 or 10,000 cases). For each context we quantified the performance of a complete case analysis and seven multiple imputation methods which deleted cases with missing outcome before imputation, after imputation or not at all; included or did not include the outcome in the imputation models; and included or did not include a secondary outcome in the imputation models. Methods were compared on mean absolute error, bias, coverage and power over 1,000 datasets for each scenario.
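The evaluation loop described above can be illustrated with a minimal sketch. It is not the authors' simulation code and it does not implement any of the seven multiple imputation methods: it only generates data, removes covariate values under each of the three missingness mechanisms, runs a complete case analysis, and summarises bias, mean absolute error, coverage and power. The data-generating model, the true slope of 0.5, the logistic missingness model, its strength of 1.5, the sample size of 500 and the 200 replications are all assumptions chosen to keep the example small and fast.

```python
import math
import random
import statistics

TRUE_SLOPE = 0.5  # hypothetical effect of covariate x on outcome y


def simulate(n, rng):
    """Draw n (x, y) pairs with y = TRUE_SLOPE * x + standard normal noise."""
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        data.append((x, TRUE_SLOPE * x + rng.gauss(0.0, 1.0)))
    return data


def observed_rows(data, mechanism, rng, strength=1.5):
    """Keep the rows whose covariate x is observed under the given mechanism.

    MCAR: missingness ignores the data; MAR: it depends on the observed
    outcome y; MNAR: it depends on the unobserved value of x itself.
    """
    kept = []
    for x, y in data:
        driver = {"MCAR": 0.0, "MAR": y, "MNAR": x}[mechanism]
        p_missing = 1.0 / (1.0 + math.exp(-strength * driver))
        if rng.random() >= p_missing:
            kept.append((x, y))
    return kept


def slope_and_se(rows):
    """Ordinary least squares slope of y on x and its standard error."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in rows)
    return slope, math.sqrt(rss / (n - 2) / sxx)


def evaluate(mechanism, n=500, reps=200, seed=1):
    """Summarise a complete case analysis over repeated simulated datasets."""
    rng = random.Random(seed)
    errors, covered, rejected = [], 0, 0
    for _ in range(reps):
        rows = observed_rows(simulate(n, rng), mechanism, rng)
        est, se = slope_and_se(rows)
        errors.append(est - TRUE_SLOPE)
        # Normal-approximation 95% interval: does it contain the truth?
        covered += abs(est - TRUE_SLOPE) <= 1.96 * se
        # Two-sided test of slope = 0 at the 5% level.
        rejected += abs(est) > 1.96 * se
    return {
        "bias": statistics.fmean(errors),
        "mae": statistics.fmean(abs(e) for e in errors),
        "coverage": covered / reps,
        "power": rejected / reps,
    }


results = {m: evaluate(m) for m in ("MCAR", "MAR", "MNAR")}
for mechanism, summary in results.items():
    print(mechanism, {k: round(v, 3) for k, v in summary.items()})
```

In this toy setup roughly half of the covariate values are removed under each mechanism. A fuller comparison along the lines of the study would replace the complete case step with each imputation strategy and pool the estimates across imputed datasets before computing the same four summaries.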

RESULTS

Overall, there was very little to separate multiple imputation methods which included the outcome in the imputation model. Even when missingness was quite extensive, all multiple imputation approaches performed well. Incorporating a secondary outcome, moderately correlated with the outcome of interest, made very little difference. The dataset size and the extent of missingness affected performance, as expected. Multiple imputation methods protected less well against missingness not at random, but did offer some protection.

CONCLUSIONS

As long as the outcome is included in the imputation model, there are very small performance differences between the possible multiple imputation approaches: not imputing the outcome, imputing the outcome, or imputing the outcome and then deleting cases whose outcome was imputed. All informative covariates, even those with very high levels of missingness, should be included in the multiple imputation model. Multiple imputation offers some protection against a simple missing not at random mechanism.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ac4e/5220613/4cb975503eb2/12874_2016_281_Fig1_HTML.jpg
