Begg M D, Lagakos S
Division of Biostatistics, Columbia University, School of Public Health, New York, NY 10032.
Environ Health Perspect. 1990 Jul;87:69-75. doi: 10.1289/ehp.908769.
Logistic regression models are commonly used to study the association between a binary response variable and an exposure variable. Besides the exposure of interest, other covariates are frequently included in the fitted model to control for their effects on the outcome. Unfortunately, misspecification of the main exposure variable and of the other covariates is not uncommon, and it can adversely affect tests of the association between exposure and response. We use the term "misspecification" to cover a broad range of modeling errors, including measurement error, discretization of continuous explanatory variables, and complete exclusion of covariates from the model. This paper reviews some recent results on the consequences of model misspecification for the large-sample properties of likelihood score tests of association between exposure and response.
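As a rough illustration of one kind of misspecification named in the abstract (discretizing a continuous exposure), the sketch below simulates a logistic model with a single exposure and no other covariates, then compares how often a score test rejects when it uses the continuous exposure versus a dichotomized version. This is not the authors' analysis. The function names `score_stat` and `rejection_rates`, the standard normal exposure, the cut point at 0, and the sample sizes are all assumptions chosen for illustration. The statistic is the usual score test of no exposure effect in a logistic model with an intercept only under the null, referred to a chi-square distribution with 1 degree of freedom.

```python
import math
import random

# 95th percentile of the chi-square distribution with 1 degree of freedom
CHI2_1_CRIT_05 = 3.841458820694124


def score_stat(x, y):
    """Score statistic for H0: no exposure effect, in logit P(y=1) = a + b*x.

    Under H0 the fitted probability is the sample mean of y, so
    U = sum x_i (y_i - p_hat) and V = p_hat (1 - p_hat) sum (x_i - x_bar)^2.
    Returns U^2 / V, or 0.0 when V is zero (no variation in x or in y).
    """
    n = len(y)
    p_hat = sum(y) / n
    x_bar = sum(x) / n
    u = sum(xi * (yi - p_hat) for xi, yi in zip(x, y))
    v = p_hat * (1.0 - p_hat) * sum((xi - x_bar) ** 2 for xi in x)
    return u * u / v if v > 0 else 0.0


def rejection_rates(beta, n=200, reps=500, alpha0=0.0, seed=1):
    """Simulated rejection rates at the 5% level (hypothetical setup).

    Returns (rate using continuous x, rate using x dichotomized at 0).
    """
    rng = random.Random(seed)
    hits_cont = 0
    hits_dich = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [
            1 if rng.random() < 1.0 / (1.0 + math.exp(-(alpha0 + beta * xi))) else 0
            for xi in x
        ]
        # Misspecified exposure: the continuous x replaced by an indicator
        x_dich = [1.0 if xi > 0 else 0.0 for xi in x]
        hits_cont += score_stat(x, y) > CHI2_1_CRIT_05
        hits_dich += score_stat(x_dich, y) > CHI2_1_CRIT_05
    return hits_cont / reps, hits_dich / reps


if __name__ == "__main__":
    for beta in (0.0, 0.5):
        cont, dich = rejection_rates(beta)
        print(f"beta={beta}: continuous={cont:.3f}, dichotomized={dich:.3f}")
```

In this simplified setting, with no other covariates, one would expect both versions to reject at roughly the nominal rate when the true effect is zero, and the dichotomized version to reject less often when the effect is nonzero. The exact figures depend on the assumed exposure distribution, cut point, and sample size.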