Gaussian-based routines to impute categorical variables in health surveys.

Author information

Department of Epidemiology and Biostatistics, School of Public Health, University at Albany, SUNY, One University Place, Rensselaer, NY 12144-3456, USA.

Publication information

Stat Med. 2011 Dec 20;30(29):3447-60. doi: 10.1002/sim.4355. Epub 2011 Oct 4.

Abstract

The multivariate normal (MVN) distribution is arguably the most popular parametric model used in imputation and is available in most software packages (e.g., SAS PROC MI, R package norm). When it is applied to categorical variables as an approximation, practitioners often either apply simple rounding techniques to ordinal variables, or create a distinct 'missing' category and/or exclude the nominal variable from the imputation phase. All of these practices can lead to biased and/or uninterpretable inferences. In this work, we develop a new rounding methodology, calibrated to preserve observed distributions, to multiply impute missing categorical covariates. The main appeal of this method is that it can be used with any 'working' imputation software, particularly software based on the MVN, allowing practitioners to obtain usable imputations with small biases. A simulation study demonstrates the clear advantage of the proposed method in rounding ordinal variables and, in some scenarios, its plausibility for imputing nominal variables. We illustrate our methods using the widely used National Survey of Children with Special Health Care Needs, where incomplete values on race posed a real threat to inferences about disparities.
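The abstract does not spell out how the rounding is calibrated, so the following is only a minimal sketch of the general idea of rounding continuous imputations so that observed category proportions are preserved. It is not the authors' procedure. The function name `calibrated_round`, the quantile-based choice of cut points, and the simulated data are all assumptions made for illustration; the continuous draws stand in for values produced by any MVN-based imputation routine.

```python
import numpy as np

def calibrated_round(continuous_draws, observed_categories, levels):
    """Hypothetical helper: map continuous draws to ordinal levels using cut
    points chosen so the rounded draws reproduce the observed proportions."""
    draws = np.asarray(continuous_draws, dtype=float)
    observed = np.asarray(observed_categories)
    levels = np.asarray(levels)
    # Share of each level among the observed (non-missing) values.
    props = np.array([(observed == lv).mean() for lv in levels])
    # Cumulative shares give the quantile levels of the interior cut points.
    cuts = np.quantile(draws, np.cumsum(props)[:-1])
    # Each draw is assigned the index of the interval it falls into.
    return levels[np.searchsorted(cuts, draws, side="left")]

# Simulated stand-ins (assumptions, not survey data).
rng = np.random.default_rng(0)
levels = [0, 1, 2]
observed = rng.choice(levels, size=1000, p=[0.7, 0.2, 0.1])
draws = rng.normal(loc=0.4, scale=0.8, size=5000)  # e.g. MVN-based imputations

naive = np.clip(np.rint(draws), 0, 2).astype(int)  # simple rounding
calibrated = calibrated_round(draws, observed, levels)

for lv in levels:
    print(lv, (observed == lv).mean(), (naive == lv).mean(), (calibrated == lv).mean())
```

In this sketch, simple rounding to the nearest integer lets the category shares drift with the location and scale of the continuous draws, whereas the quantile-based cut points tie the shares of the rounded values to those seen in the observed data.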

Similar articles

Multiple imputation in the presence of non-normal data.
Stat Med. 2017 Feb 20;36(4):606-617. doi: 10.1002/sim.7173. Epub 2016 Nov 15.

Cited by

Evaluation of approaches for multiple imputation of three-level data.
BMC Med Res Methodol. 2020 Aug 12;20(1):207. doi: 10.1186/s12874-020-01079-8.

Model checking in multiple imputation: an overview and case study.
Emerg Themes Epidemiol. 2017 Aug 23;14:8. doi: 10.1186/s12982-017-0062-6. eCollection 2017.

References

Unmet need among children with special health care needs in Massachusetts.
Matern Child Health J. 2008 Sep;12(5):650-61. doi: 10.1007/s10995-007-0283-3. Epub 2007 Sep 25.
