Department of Mathematics and Statistics, The University of Toledo, Toledo, Ohio.
Stat Med. 2019 Feb 10;38(3):452-479. doi: 10.1002/sim.7987. Epub 2018 Oct 11.
Missing covariates in regression analysis are a pervasive problem in medical, social, and economic research. We study empirical-likelihood confidence regions for unconstrained and constrained regression parameters in a nonignorable covariate-missing data problem. For an assumed conditional mean regression model, we assume that some covariates are fully observed while others are missing for some subjects. By exploiting a probability model of missingness and a working conditional score model from a semiparametric perspective, we build a system of unbiased estimating equations in which the number of equations exceeds the number of unknown parameters. Based on these estimating equations, we introduce unconstrained and constrained empirical-likelihood ratio statistics to construct empirical-likelihood confidence regions for the underlying regression parameters, without and with constraints. We establish the asymptotic distributions of the proposed empirical-likelihood ratio statistics. Simulation results show that the proposed empirical-likelihood methods have better finite-sample performance than competing methods in terms of coverage probability and interval length. Finally, we apply the proposed empirical-likelihood methods to the analysis of a data set from the US National Health and Nutrition Examination Survey.
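To make the central computation concrete, the following is a minimal sketch of a generic empirical-likelihood ratio statistic for an over-identified system, here two estimating equations evaluated at a fixed candidate parameter value. It is not the authors' implementation: the function name `el_ratio_stat`, the restriction to two equations, and the toy inputs are all assumptions made for illustration, and the estimating functions proposed in the paper (built from the missingness model and the working score model) are not reproduced. The statistic is computed through the standard dual problem, maximizing the sum of log(1 + lambda' g_i) over the Lagrange multiplier lambda.

```python
import math

def el_ratio_stat(g, tol=1e-10, max_iter=100):
    """Return -2 log empirical-likelihood ratio for rows g_i = (g_i1, g_i2).

    g holds the estimating functions evaluated at a candidate parameter.
    Hypothetical helper: solves the dual problem by damped Newton steps.
    """
    def dual(l):
        # Sum of log(1 + l'g_i); -inf if any implied weight is non-positive.
        total = 0.0
        for a, b in g:
            d = 1.0 + l[0] * a + l[1] * b
            if d <= 0.0:
                return -math.inf
            total += math.log(d)
        return total

    lam = [0.0, 0.0]
    for _ in range(max_iter):
        s1 = s2 = h11 = h12 = h22 = 0.0
        for a, b in g:
            d = 1.0 + lam[0] * a + lam[1] * b
            s1 += a / d
            s2 += b / d
            h11 += a * a / d ** 2
            h12 += a * b / d ** 2
            h22 += b * b / d ** 2
        det = h11 * h22 - h12 * h12
        # Newton direction H^{-1} s for the concave dual objective.
        d1 = (h22 * s1 - h12 * s2) / det
        d2 = (h11 * s2 - h12 * s1) / det
        step, cur = 1.0, dual(lam)
        while step > 1e-12:
            new = [lam[0] + step * d1, lam[1] + step * d2]
            if dual(new) >= cur:  # accept only non-decreasing, feasible steps
                break
            step /= 2.0
        lam = new
        if abs(step * d1) + abs(step * d2) < tol:
            break
    return 2.0 * dual(lam)

# Toy inputs (made up): the first set averages to zero, the second does not.
centered = [(1, 1), (-1, 1), (1, -1), (-1, -1)]
shifted = [(1.5, 1), (-0.5, 1), (1.5, -1), (-0.5, -1)]
print(el_ratio_stat(centered), el_ratio_stat(shifted))
```

A confidence region is then the set of candidate parameter values whose statistic falls below a critical value taken from the relevant asymptotic distribution. The sketch assumes the zero vector lies inside the convex hull of the rows of `g`; otherwise the dual problem has no finite solution.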