D'Angelo Gina, Weissfeld Lisa
Department of Biostatistics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, PA 15261, USA.
Stat Med. 2007 May 10;26(10):2137-53. doi: 10.1002/sim.2686.
This paper addresses logistic regression with missing covariate data and evaluates the properties of an efficient score for logistic regression in a two-phase design. Simulation studies show that the efficient score is more efficient than two other pseudo-likelihood methods when the correlation between the missing covariate and its surrogate is high or the sampling proportion is small. These methods are illustrated with data from the National Wilms Tumor Study Group. Results from the example confirm the simulation study findings, with the exception that the pseudo-likelihood approach produces more reliable estimates than the weighted pseudo-likelihood approach.
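To make the two-phase setting concrete, the following is a minimal simulation sketch of a weighted pseudo-likelihood (inverse-probability-weighted) logistic regression. It is not the authors' implementation and it does not implement the efficient score. The sample size, coefficient values, surrogate definition, sampling probabilities, and the helper name `fit_weighted_logistic` are all illustrative assumptions, not values from the paper.

```python
import numpy as np


def fit_weighted_logistic(X, y, w, n_iter=50, tol=1e-10):
    """Hypothetical helper: Newton-Raphson for logistic regression
    with per-observation weights w (weighted score equations)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        score = X.T @ (w * (y - p))                     # weighted score
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


rng = np.random.default_rng(0)

# Phase one (assumed setup): outcome y and a cheap binary surrogate z
# are observed for everyone; the covariate x is not.
N = 20000
x = rng.normal(size=N)
z = (x + rng.normal(scale=0.5, size=N) > 0).astype(int)
true_beta = np.array([-1.0, 1.0])  # assumed intercept and slope
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(true_beta[0] + true_beta[1] * x))))

# Phase two (assumed design): x is measured only on a subsample whose
# selection probability depends on the phase-one strata (y, z).
pi = np.where(y == 1, 0.5, np.where(z == 1, 0.2, 0.1))
selected = rng.random(N) < pi

# Weighted pseudo-likelihood: fit on phase-two subjects, weighting each
# by the inverse of its known selection probability.
X2 = np.column_stack([np.ones(selected.sum()), x[selected]])
beta_w = fit_weighted_logistic(X2, y[selected], 1.0 / pi[selected])

print("true coefficients:    ", true_beta)
print("weighted estimates:   ", beta_w)
```

The inverse-probability weights correct for the outcome- and surrogate-dependent sampling, so the weighted estimates should land near the assumed true coefficients. The sketch says nothing about relative efficiency, which is the question the simulation studies in the paper address.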