
Joint Imputation of General Data.

Author information

Robbins Michael W

Affiliation

Senior Statistician with the RAND Corporation, Pittsburgh, PA 15213, USA.

Publication information

J Surv Stat Methodol. 2023 Sep 12;12(1):183-210. doi: 10.1093/jssam/smad034. eCollection 2024 Feb.

Abstract

High-dimensional complex survey data of general structures (e.g., containing continuous, binary, categorical, and ordinal variables), such as the US Department of Defense's Health-Related Behaviors Survey (HRBS), often confound procedures designed to impute missing survey data. Imputation by fully conditional specification (FCS) is often considered the state of the art for such datasets due to its generality and flexibility. However, FCS procedures contain a theoretical flaw that is exposed by HRBS data: HRBS imputations created with FCS are shown to diverge across iterations of Markov chain Monte Carlo. Imputation by joint modeling lacks this flaw; however, current joint modeling procedures are neither general nor flexible enough to handle HRBS data. As such, we introduce an algorithm that efficiently and flexibly applies multiple imputation by joint modeling in data of general structures. This procedure draws imputations from a latent joint multivariate normal model that underpins the generally structured data and models the latent data via a sequence of conditional linear models, the predictors of which can be specified by the user. We perform rigorous evaluations of HRBS imputations created with the new algorithm and show that they are convergent and of high quality. Lastly, simulations verify that the proposed method performs well compared to existing algorithms including FCS.
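
The idea of modeling a joint multivariate normal through a sequence of conditional linear models can be illustrated with a small sketch. This is not the paper's algorithm: it only shows the standard factorization of a zero-mean normal vector into regressions of each component on the preceding ones, plus a binary variable obtained by thresholding one latent component. The covariance matrix `sigma`, the function names, and the zero threshold are all assumptions made for illustration. The actual procedure additionally handles missing values, categorical and ordinal variables, and user-specified predictors, none of which appear here.

```python
import numpy as np


def sequential_conditionals(sigma):
    """Factor N(0, sigma) into conditional linear models.

    Returns a list of (beta_j, s2_j) such that
    z_j | z_1..z_{j-1} ~ N(beta_j . z_{<j}, s2_j).
    Hypothetical helper, not taken from the paper.
    """
    p = sigma.shape[0]
    models = []
    for j in range(p):
        if j == 0:
            beta = np.zeros(0)           # first variable has no predictors
            s2 = sigma[0, 0]
        else:
            s_pp = sigma[:j, :j]         # covariance among the predictors
            s_pj = sigma[:j, j]          # covariance of predictors with z_j
            beta = np.linalg.solve(s_pp, s_pj)
            s2 = sigma[j, j] - s_pj @ beta   # residual (conditional) variance
        models.append((beta, s2))
    return models


def draw_latent(models, n, rng):
    """Draw n latent vectors one variable at a time from the conditionals."""
    z = np.zeros((n, len(models)))
    for j, (beta, s2) in enumerate(models):
        mean = z[:, :j] @ beta           # linear predictor from earlier columns
        z[:, j] = mean + np.sqrt(s2) * rng.standard_normal(n)
    return z


# Assumed example covariance for three latent variables.
sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

rng = np.random.default_rng(0)
models = sequential_conditionals(sigma)
z = draw_latent(models, 200_000, rng)

# A binary observed variable as a thresholded latent normal (assumed cut at 0).
x_binary = (z[:, 1] > 0).astype(int)

# The sequential draws should reproduce the joint covariance up to sampling error.
sample_cov = np.cov(z, rowvar=False)
```

Because each conditional model is derived from one joint covariance, the sequence of regressions is guaranteed to correspond to a valid joint distribution. That compatibility is what separately specified FCS conditionals are not guaranteed to have.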
