Hierarchical imputation of systematically and sporadically missing data: An approximate Bayesian approach using chained equations.

Author information

Jolani Shahab

Affiliation

Department of Methodology and Statistics, CAPHRI, Maastricht University, 6229 HA Maastricht, The Netherlands.

Publication information

Biom J. 2018 Mar;60(2):333-351. doi: 10.1002/bimj.201600220. Epub 2017 Oct 9.

Abstract

In the health and medical sciences, multiple imputation (MI) is becoming a popular way to obtain valid inferences in the presence of missing data. However, MI of clustered data, such as multicenter studies and individual participant data meta-analyses, requires advanced imputation routines that preserve the hierarchical structure of the data. In clustered data, a specific challenge is the presence of systematically missing data, where a variable is completely missing in some clusters, and sporadically missing data, where it is partly missing in some clusters. Unfortunately, little is known about how to perform MI when both types of missing data occur simultaneously. We develop a new class of hierarchical imputation approaches based on chained equations methodology that simultaneously imputes systematically and sporadically missing data while allowing for arbitrary patterns of missingness among them. Here, we use a random effect imputation model and adopt a simplification over fully Bayesian techniques such as the Gibbs sampler to directly obtain draws of parameters within each step of the chained equations. We justify through theoretical arguments and extensive simulation studies that the proposed imputation methodology has good statistical properties in terms of bias and coverage rates of parameter estimates. An illustration is given in a case study with eight individual participant datasets.
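The distinction between systematically and sporadically missing data can be made concrete with a small sketch. The Python code below is illustrative only and is not the imputation method of the paper: it labels each cluster by how one variable is missing within it. The function name `classify_missingness`, the field names `study` and `bmi`, and the toy values are all hypothetical.

```python
from collections import defaultdict


def classify_missingness(rows, cluster_key, variable):
    """Label each cluster as 'systematic', 'sporadic', or 'complete' for one variable.

    A value counts as missing when it is None or absent from the row.
    """
    counts = defaultdict(lambda: [0, 0])  # cluster -> [n_missing, n_total]
    for row in rows:
        tally = counts[row[cluster_key]]
        tally[1] += 1
        if row.get(variable) is None:
            tally[0] += 1

    labels = {}
    for cluster, (n_missing, n_total) in counts.items():
        if n_missing == n_total:
            labels[cluster] = "systematic"  # variable never observed in this cluster
        elif n_missing > 0:
            labels[cluster] = "sporadic"    # variable observed for some units only
        else:
            labels[cluster] = "complete"    # variable observed for every unit
    return labels


# Hypothetical toy data: three studies (clusters), one covariate.
rows = [
    {"study": "A", "bmi": 24.1}, {"study": "A", "bmi": None},
    {"study": "B", "bmi": None}, {"study": "B", "bmi": None},
    {"study": "C", "bmi": 27.3}, {"study": "C", "bmi": 22.8},
]
labels = classify_missingness(rows, "study", "bmi")
```

In this toy example, study B would need the variable imputed for every participant using information borrowed from the other studies, whereas study A has some observed values of its own. This is the situation a hierarchical imputation model is meant to handle.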
