Veierød M B, Laake P
Section of Medical Statistics, University of Oslo, P.O. Box 1122, Blindern, 0317 Oslo, Norway.
Stat Med. 2001 Mar 15;20(5):771-84. doi: 10.1002/sim.712.
In epidemiologic studies of the association between exposure and disease, misclassification of exposure is common and is known to bias the effect estimates. The nature of this bias is difficult to predict. To address this, we present a simple method to assess the bias in Poisson regression coefficients for a categorical exposure variable subject to misclassification. We derive expressions for the category-specific coefficients from the regression on the error-prone exposure (naive coefficients) in terms of the coefficients from the regression on the true exposure (true coefficients). These expressions are similar for crude and adjusted models, provided that the covariates are measured without error and that the misclassification probabilities are independent of the covariate values. We find that the bias in the naive coefficient for one category of the exposure variable depends on all true category-specific coefficients, weighted by misclassification probabilities. On the other hand, misclassification of an exposure variable does not induce bias in the estimates of the coefficients of the (perfectly measured) covariates. Similarities with linear regression models are pointed out. For selected scenarios of true exposure-disease associations and selected patterns of misclassification, we illustrate the inconsistency of the naive Poisson regression coefficients and show that the nature of the bias can be difficult to characterize intuitively. Both the magnitude and the direction of the bias may vary between categories of an exposure variable.
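The dependence of each naive coefficient on all true coefficients can be illustrated with a small numerical sketch. This is not the paper's own derivation. It assumes a crude model (no covariates), misclassification that is unrelated to disease, and that the rate in an observed category is the average of the true category rates weighted by P(true category | observed category). The function name `naive_coefficients`, the prevalences, and the misclassification matrix are hypothetical values chosen only for illustration.

```python
import numpy as np


def naive_coefficients(beta_true, prevalence, misclass):
    """Large-sample naive log rate ratios under the assumptions stated above.

    beta_true  : true log rate ratios per category (reference category first, = 0)
    prevalence : P(X = j), distribution of the true exposure (assumed values)
    misclass   : matrix with misclass[j, k] = P(W = k | X = j), where W is the
                 observed (error-prone) category
    """
    beta_true = np.asarray(beta_true, dtype=float)
    prevalence = np.asarray(prevalence, dtype=float)
    misclass = np.asarray(misclass, dtype=float)

    # Joint distribution P(X = j, W = k), then Bayes' rule for P(X = j | W = k).
    joint = prevalence[:, None] * misclass
    posterior = joint / joint.sum(axis=0, keepdims=True)

    # Relative rate in observed category k: sum_j exp(beta_j) * P(X = j | W = k).
    relative_rate = np.exp(beta_true) @ posterior

    # Naive coefficients are log relative rates against the observed reference.
    log_rate = np.log(relative_rate)
    return log_rate - log_rate[0]


# Hypothetical three-category exposure with true rate ratios 1, 2 and 4.
beta = np.log([1.0, 2.0, 4.0])
prev = np.array([0.5, 0.3, 0.2])
# Hypothetical misclassification: rows are true categories, columns observed.
m = np.array([
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.3, 0.7],
])

naive = naive_coefficients(beta, prev, m)
print("true :", np.round(beta, 3))
print("naive:", np.round(naive, 3))
```

Each naive coefficient mixes all the true coefficients through the posterior weights, so changing one row of the misclassification matrix can shift the naive estimates for several categories at once. With no misclassification (an identity matrix) the function returns the true coefficients unchanged.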