SuperMICE: An Ensemble Machine Learning Approach to Multiple Imputation by Chained Equations.

Author information

Laqueur Hannah S, Shev Aaron B, Kagawa Rose M C

Publication information

Am J Epidemiol. 2022 Feb 19;191(3):516-525. doi: 10.1093/aje/kwab271.

Abstract

Researchers often face the problem of how to address missing data. Multiple imputation is a popular approach, with multiple imputation by chained equations (MICE) being among the most common and flexible methods for execution. MICE iteratively fits a predictive model for each variable with missing values, conditional on other variables in the data. In theory, any imputation model can be used to predict the missing values. However, if the predictive models are incorrectly specified, they may produce biased estimates of the imputed data, yielding inconsistent parameter estimates and invalid inference. Given the set of modeling choices that must be made in conducting multiple imputation, in this paper we propose a data-adaptive approach to model selection. Specifically, we adapt MICE to incorporate an ensemble algorithm, Super Learner, to predict the conditional mean for each missing value, and we also incorporate a local kernel-based estimate of variance. We present a set of simulations indicating that this approach produces final parameter estimates with lower bias and better coverage than other commonly used imputation methods. These results suggest that using a flexible machine learning imputation approach can be useful in settings where data are missing at random, especially when the relationships among the variables are complex.
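
The chained-equations loop described in the abstract can be illustrated with a small sketch. This is not the authors' SuperMICE implementation: it assumes only two numeric variables, uses two hypothetical candidate learners (`fit_mean`, `fit_linear`), and picks the single best candidate by cross-validated error (a "discrete" selection) rather than forming the weighted Super Learner ensemble. It also adds noise from a global residual standard deviation, where the paper uses a local kernel-based variance estimate. All function names are illustrative.

```python
import random

def fit_mean(xs, ys):
    """Candidate 1: ignore the predictor and always predict the mean of y."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Candidate 2: least-squares line y = a + b * x."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    return lambda x: a + b * x

CANDIDATES = [fit_mean, fit_linear]  # hypothetical candidate library

def cv_mse(fit, xs, ys, k=5):
    """k-fold cross-validated mean squared error of one candidate."""
    n, total = len(xs), 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - model(xs[i])) ** 2 for i in test)
    return total / n

def select_and_fit(xs, ys):
    """Refit the candidate with the lowest CV error; return model and residual SD."""
    best = min(CANDIDATES, key=lambda fit: cv_mse(fit, xs, ys))
    model = best(xs, ys)
    mse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return model, mse ** 0.5

def chained_imputation(rows, n_iter=10, seed=0):
    """rows: list of [x, y] with None marking missing cells.
    Returns one completed copy; call with different seeds for multiple imputations."""
    rng = random.Random(seed)
    data = [list(r) for r in rows]
    missing = [[v is None for v in r] for r in rows]
    # Start by filling each missing cell with its observed column mean.
    for j in range(2):
        obs = [r[j] for r in rows if r[j] is not None]
        mean_j = sum(obs) / len(obs)
        for r in data:
            if r[j] is None:
                r[j] = mean_j
    for _ in range(n_iter):
        for j in range(2):  # j is the variable currently being imputed
            if not any(m[j] for m in missing):
                continue
            p = 1 - j  # the other variable acts as the predictor
            idx = [i for i in range(len(data)) if not missing[i][j]]
            model, sd = select_and_fit([data[i][p] for i in idx],
                                       [data[i][j] for i in idx])
            for i in range(len(data)):
                if missing[i][j]:
                    # Predicted conditional mean plus noise (global SD here).
                    data[i][j] = model(data[i][p]) + rng.gauss(0.0, sd)
    return data
```

Running `chained_imputation` several times with different `seed` values gives multiple completed data sets, whose analysis results would then be pooled. A fuller treatment would also propagate uncertainty in the fitted models themselves, which this sketch omits.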
