Department of Mathematics and Statistics, Villanova University, Villanova, PA 19085, USA.
Stat Med. 2018 Apr 15;37(8):1325-1342. doi: 10.1002/sim.7584. Epub 2018 Jan 9.
Missing covariate values are prevalent in regression applications. Although an array of methods has been developed for estimating parameters in regression models with missing covariate data for a variety of response types, little attention has been given to validation of the response model and to influence diagnostics. Previous research has mainly focused on estimating residuals for observations with missing covariates using expected values, after which specialized techniques are needed to conduct proper inference. We suggest a multiple imputation strategy that allows standard methods for residual analysis to be applied to the imputed data sets or to a stacked data set. We demonstrate the suggested multiple imputation method by analyzing the Sleep in Mammals data with a linear regression model and the New York Social Indicators Status data with a logistic regression model.
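The general idea of running standard residual analyses on imputed or stacked data sets can be illustrated with a minimal sketch. It is not the authors' procedure: the data are simulated, the function names (`ols_fit`, `impute_once`, `stacked_residuals`) are hypothetical, and the imputation step is a simple stochastic regression imputation that does not draw the imputation-model parameters from a posterior, as a proper multiple imputation would.

```python
import numpy as np

def ols_fit(X, y):
    """Ordinary least squares coefficients for a design matrix X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def impute_once(x1, x2, y, rng):
    """Fill missing x2 by regressing it on (x1, y) among complete cases
    and adding normal noise (assumption: a simplified imputation model)."""
    miss = np.isnan(x2)
    Z = np.column_stack([np.ones_like(x1), x1, y])
    gamma = ols_fit(Z[~miss], x2[~miss])
    res = x2[~miss] - Z[~miss] @ gamma
    sigma = np.sqrt(res @ res / (res.size - Z.shape[1]))
    x2_imp = x2.copy()
    x2_imp[miss] = Z[miss] @ gamma + rng.normal(0.0, sigma, miss.sum())
    return x2_imp

def stacked_residuals(x1, x2, y, m=5, seed=0):
    """Fit the response model on each of m imputed data sets and stack
    the results. Columns: imputation index, row index, fitted, residual."""
    rng = np.random.default_rng(seed)
    n = y.size
    blocks = []
    for k in range(m):
        x2_imp = impute_once(x1, x2, y, rng)
        X = np.column_stack([np.ones(n), x1, x2_imp])
        fitted = X @ ols_fit(X, y)
        blocks.append(np.column_stack(
            [np.full(n, k), np.arange(n), fitted, y - fitted]))
    return np.vstack(blocks)

# Simulated example (hypothetical data, roughly 30% of x2 missing).
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2_full = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2_full + rng.normal(size=n)
x2 = x2_full.copy()
x2[rng.random(n) < 0.3] = np.nan

stacked = stacked_residuals(x1, x2, y, m=5)
print(stacked.shape)  # (1000, 4)
```

The stacked array can then be passed to the usual diagnostics, for example a residuals-versus-fitted plot, either per imputation (filtering on the first column) or pooled across all imputations.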