Department of Educational Measurement, Leibniz Institute for Science and Mathematics Education.
Department of Psychology, Arizona State University.
Psychol Methods. 2020 Apr;25(2):157-181. doi: 10.1037/met0000233. Epub 2019 Sep 2.
When estimating multiple regression models with incomplete predictor variables, it is necessary to specify a joint distribution for the predictor variables. A convenient assumption is that this distribution is a joint normal distribution, the default in many statistical software packages. This distribution will in general be misspecified if the predictors with missing data have nonlinear effects (e.g., x²) or are included in interaction terms (e.g., x·z). In the present article, we discuss a sequential modeling approach that can be applied to decompose the joint distribution of the variables into 2 parts: (a) a part that is due to the model of interest and (b) a part that is due to the model for the incomplete predictors. We demonstrate how the sequential modeling approach can be used to implement a multiple imputation strategy based on Bayesian estimation techniques that can accommodate rather complex substantive regression models with nonlinear effects and also allows a flexible treatment of auxiliary variables. In 4 simulation studies, we showed that the sequential modeling approach can be applied to estimate nonlinear effects in regression models with missing values on continuous, categorical, or skewed predictor variables under a broad range of conditions and investigated the robustness of the proposed approach against distributional misspecifications. We developed the R package mdmb, which facilitates a user-friendly application of the sequential modeling approach, and we present a real-data example that illustrates the flexibility of the software. (PsycINFO Database Record (c) 2020 APA, all rights reserved).
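The decomposition described above can be written as f(y, x | z) = f(y | x, z) · f(x | z), where the first factor is the model of interest and the second is the model for the incomplete predictor. The following is a minimal Python sketch of how an imputation for a missing x could be drawn from that product when the substantive model contains a quadratic term. It is an illustration under stated assumptions, not the article's implementation and not the mdmb API: the function names are hypothetical, both models are assumed normal, the parameter values are held fixed (the article estimates them with Bayesian techniques), and the draw uses a simple grid approximation chosen here for brevity.

```python
import math
import random


def log_normal_pdf(value, mean, sd):
    """Log density of a normal distribution at a single value."""
    return -0.5 * math.log(2.0 * math.pi * sd**2) - 0.5 * ((value - mean) / sd) ** 2


def log_joint(x, y, z, beta, sigma, gamma, tau):
    """Return log f(y | x, z) + log f(x | z) for one case (hypothetical helper)."""
    # (a) Model of interest, assumed here: y = b0 + b1*x + b2*x^2 + b3*z + e
    mean_y = beta[0] + beta[1] * x + beta[2] * x**2 + beta[3] * z
    # (b) Model for the incomplete predictor, assumed here: x = g0 + g1*z + u
    mean_x = gamma[0] + gamma[1] * z
    return log_normal_pdf(y, mean_y, sigma) + log_normal_pdf(x, mean_x, tau)


def draw_missing_x(y, z, beta, sigma, gamma, tau, n_draws, rng):
    """Draw imputations for a missing x from a grid approximation of
    f(x | y, z), which is proportional to f(y | x, z) * f(x | z)."""
    # Assumption: the plausible range of x is covered by [-6, 6].
    grid = [-6.0 + 0.0025 * i for i in range(4801)]
    log_dens = [log_joint(x, y, z, beta, sigma, gamma, tau) for x in grid]
    max_log = max(log_dens)
    # Subtract the maximum before exponentiating to avoid numerical underflow
    # at the mode; far-away grid points simply receive weight 0.
    weights = [math.exp(ld - max_log) for ld in log_dens]
    return rng.choices(grid, weights=weights, k=n_draws)


# Illustrative, made-up parameter values: y = x + 0.5*x^2 + e and x ~ N(0, 1).
beta = (0.0, 1.0, 0.5, 0.0)
gamma = (0.0, 0.0)
rng = random.Random(1)
draws = draw_missing_x(y=1.5, z=0.0, beta=beta, sigma=0.1,
                       gamma=gamma, tau=1.0, n_draws=2000, rng=rng)

# x + 0.5*x^2 = 1.5 has the two roots x = 1 and x = -3, so the conditional
# distribution of the missing x is bimodal rather than normal.
share_near_one = sum(abs(d - 1.0) < 0.5 for d in draws) / len(draws)
print(f"share of draws near x = 1: {share_near_one:.3f}")
```

In this toy setting the observed y is compatible with two values of x, and the predictor model f(x | z) decides how much weight each receives. A single joint normal imputation model cannot represent such a conditional distribution, which is the misspecification the abstract refers to.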