Rosner B, Spiegelman D, Willett W C
Department of Preventive Medicine, Harvard Medical School, Boston, MA.
Am J Epidemiol. 1990 Oct;132(4):734-45. doi: 10.1093/oxfordjournals.aje.a115715.
When several risk factors for disease are included in the same multiple logistic regression model and some of them are measured with error, the point and interval estimates of relative risk for any of these factors may be biased either toward or away from the null value. A method is provided for correcting point and interval estimates of relative risk obtained from logistic regression for measurement error in one or more continuous variables. The method requires a separate validation study to estimate the coefficients of the multivariate linear regression model relating the surrogate variables to the vector of true risk factors. Similar methods have been suggested by other authors, but none provides corrected confidence intervals that include the component of variability due to estimating the measurement error parameters from a validation study. An example is provided from a prospective study of dietary fat, calories, and alcohol in relation to breast cancer, and from a validation study of the questionnaire used to assess these nutrients. Before correction for measurement error, the age-adjusted relative risk for a 25 g increment in alcohol intake was 1.33 (95% confidence interval (CI) 1.14-1.55); after correction, the relative risk increased to 1.62 (95% CI 1.23-2.12). Similarly, for a 10 g increment in saturated fat intake, the age-adjusted relative risk was 0.94 (95% CI 0.83-1.06) before correction and 0.84 (95% CI 0.59-1.20) after correction. These results indicate that the failure to find a substantial positive association between breast cancer risk and saturated fat intake cannot be explained by measurement error in fat, calories, or alcohol.
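The kind of correction described in the abstract can be illustrated with a minimal sketch. It is not the paper's exact estimator. It assumes the common regression calibration convention, in which the validation study regresses the true exposure on the surrogate, so that the observed log relative risks equal the true ones multiplied by the calibration slope (one variable) or by the transposed calibration matrix (several variables). The function names, the calibration slope `lam`, its standard error `se_lam`, and the matrix `gamma` are all hypothetical inputs, and the interval uses a first-order delta-method approximation.

```python
import math
import numpy as np


def correct_univariate(rr, ci_low, ci_high, lam, se_lam, z=1.96):
    """Correct one relative risk and its CI for measurement error.

    Assumptions (illustrative, not taken from the paper):
    - lam is the slope from regressing the true exposure on the surrogate,
      estimated in a validation study, with standard error se_lam;
    - the uncorrected CI is symmetric on the log scale.
    """
    beta = math.log(rr)                                   # uncorrected log RR
    se_beta = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    beta_c = beta / lam                                   # corrected log RR
    # Delta method: variance from the main study plus variance
    # from estimating lam in the validation study.
    var_c = se_beta**2 / lam**2 + beta**2 * se_lam**2 / lam**4
    se_c = math.sqrt(var_c)
    return (
        math.exp(beta_c),
        math.exp(beta_c - z * se_c),
        math.exp(beta_c + z * se_c),
    )


def correct_multivariate_point(beta_obs, gamma):
    """Point correction for several covariates measured with error.

    Assumes E[true | surrogate] = a + gamma @ surrogate, so that
    beta_obs = gamma.T @ beta_true; solve that system for beta_true.
    """
    beta_obs = np.asarray(beta_obs, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    return np.linalg.solve(gamma.T, beta_obs)


if __name__ == "__main__":
    # Uncorrected alcohol estimate from the abstract; lam and se_lam
    # are made-up values chosen only to show the mechanics.
    print(correct_univariate(1.33, 1.14, 1.55, lam=0.6, se_lam=0.05))
```

Because the off-diagonal entries of `gamma` mix the covariates, a corrected coefficient in the multivariate case can move either toward or away from the null, which matches the abstract's opening point. The univariate function widens the interval relative to the uncorrected one whenever `lam` is below 1 or `se_lam` is positive.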