Buonaccorsi John P, Laake Petter, Veierød Marit B
Department of Mathematics and Statistics, University of Massachusetts, Amherst, Massachusetts 01003, USA.
Biometrics. 2005 Sep;61(3):831-6. doi: 10.1111/j.1541-0420.2005.00336.x.
This note clarifies under what conditions a naive analysis using a misclassified predictor will induce bias in the regression coefficients of other perfectly measured predictors in the model. An apparent discrepancy between some previous results and a result for measurement error of a continuous variable in linear regression is resolved. We show that, as in the linear setting, misclassification (even when not related to the other predictors) induces bias in the coefficients of the perfectly measured predictors, unless the misclassified variable and the perfectly measured predictors are independent. Conditional and asymptotic biases are discussed in the case of linear regression, and explored numerically for an example relating birth weight to the weight and smoking status of the mother.
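The main claim can be illustrated with a small simulation. The sketch below is not the authors' analysis and does not use their birth weight data: the variable names, effect sizes, misclassification rates, and the helper `naive_z_coefficient` are all hypothetical choices made for illustration. It generates a binary predictor `x` (think of smoking status), a continuous predictor `z` (think of mother's weight), flips `x` into a misclassified version `w` with error rates that do not depend on `z`, and then fits the naive least-squares regression of the outcome on `w` and `z`.

```python
import numpy as np


def naive_z_coefficient(n, dependent, rng, beta_x=-0.4, beta_z=0.02,
                        sensitivity=0.8, specificity=0.9):
    """Return the naive OLS coefficient of z when x is replaced by w.

    All numeric settings are hypothetical, chosen only for illustration.
    """
    # True binary predictor (e.g. smoking status).
    x = rng.binomial(1, 0.4, size=n)

    # Perfectly measured predictor; its mean depends on x only when
    # `dependent` is True, otherwise x and z are independent.
    shift = -5.0 if dependent else 0.0
    z = 60.0 + shift * x + rng.normal(0.0, 10.0, size=n)

    # Outcome from a linear model in the TRUE predictors.
    y = 3.0 + beta_x * x + beta_z * z + rng.normal(0.0, 0.5, size=n)

    # Misclassified version of x. The error rates depend on x only,
    # so the misclassification is unrelated to z.
    u = rng.random(n)
    w = np.where(x == 1, u < sensitivity, u > specificity).astype(float)

    # Naive analysis: regress y on an intercept, w, and z.
    design = np.column_stack([np.ones(n), w, z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200_000
    print("true coefficient of z:       ", 0.02)
    print("naive, x and z dependent:    ", naive_z_coefficient(n, True, rng))
    print("naive, x and z independent:  ", naive_z_coefficient(n, False, rng))
```

Under these assumed settings, the naive coefficient of `z` should drift away from the true value 0.02 when `x` and `z` are dependent and stay close to it when they are independent, which is the pattern the abstract describes for the asymptotic bias.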