Comparing the effects of continuous and discrete covariate mismeasurement, with emphasis on the dichotomization of mismeasured predictors.

Author information

Gustafson Paul, Le Nhu D

Affiliation

Department of Statistics, University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada.

Publication information

Biometrics. 2002 Dec;58(4):878-87. doi: 10.1111/j.0006-341x.2002.00878.x.

Abstract

It is well known that imprecision in the measurement of predictor variables typically leads to bias in estimated regression coefficients. We compare the bias induced by measurement error in a continuous predictor with that induced by misclassification of a binary predictor in the contexts of linear and logistic regression. To make the comparison fair, we consider misclassification probabilities for a binary predictor that correspond to dichotomizing an imprecise continuous predictor in lieu of its precise counterpart. On this basis, nondifferential binary misclassification is seen to yield more bias than nondifferential continuous measurement error. However, it is known that differential misclassification results if a binary predictor is actually formed by dichotomizing a continuous predictor subject to nondifferential measurement error. When the postulated model linking the response and precise continuous predictor is correct, this differential misclassification is found to yield less bias than continuous measurement error, in contrast with nondifferential misclassification, i.e., dichotomization reduces the bias due to mismeasurement. This finding, however, is sensitive to the form of the underlying relationship between the response and the continuous predictor. In particular, we give a scenario where dichotomization involves a trade-off between model fit and misclassification bias. We also examine how the bias depends on the choice of threshold in the dichotomization process and on the correlation between the imprecise predictor and a second precise predictor.
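The ordering of biases described in the abstract can be illustrated with a small simulation. What follows is a minimal sketch under assumptions that are not taken from the paper: a single standard-normal predictor, a linear response, additive normal measurement error with standard deviation `tau`, a threshold at the median, and "bias" summarized as the ratio of the slope fitted with the mismeasured predictor to the slope fitted with its precise counterpart. The names `simulate`, `slope`, `tau`, and `threshold` are hypothetical. Under these assumptions the continuous ratio should sit near the classical attenuation factor 1/(1 + tau^2).

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys on xs in a simple linear regression."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=100_000, tau=0.5, threshold=0.0, beta=1.0, seed=0):
    """Return attenuation ratios (mismeasured slope / precise slope).

    All settings here are illustrative assumptions, not the paper's design.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]             # precise predictor
    x_star = [xi + tau * rng.gauss(0, 1) for xi in x]   # nondifferential error
    y = [beta * xi + rng.gauss(0, 1) for xi in x]       # linear response

    # Dichotomize the precise and the imprecise continuous predictor.
    b = [1.0 if xi > threshold else 0.0 for xi in x]
    b_star = [1.0 if xi > threshold else 0.0 for xi in x_star]

    # Sensitivity and specificity implied by dichotomizing x_star.
    n_pos = sum(b)
    sens = sum(bs for bi, bs in zip(b, b_star) if bi == 1.0) / n_pos
    spec = sum(1.0 - bs for bi, bs in zip(b, b_star) if bi == 0.0) / (n - n_pos)

    # Nondifferential misclassification with the same sensitivity and
    # specificity: errors depend only on b, not on x or y.
    b_nd = []
    for bi in b:
        if bi == 1.0:
            b_nd.append(1.0 if rng.random() < sens else 0.0)
        else:
            b_nd.append(0.0 if rng.random() < spec else 1.0)

    return {
        "continuous": slope(x_star, y) / slope(x, y),
        "binary_differential": slope(b_star, y) / slope(b, y),
        "binary_nondifferential": slope(b_nd, y) / slope(b, y),
    }


ratios = simulate()
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

A ratio closer to 1 means less attenuation. In this setup the nondifferential binary ratio is the smallest and the ratio for the dichotomized noisy predictor is the largest, which matches the ordering the abstract reports for a correctly specified model. The sketch does not cover logistic regression, other thresholds, a second correlated predictor, or a misspecified response model, all of which the abstract says can change the picture.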
