Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY 14642, USA.
Biostatistics. 2012 Jan;13(1):32-47. doi: 10.1093/biostatistics/kxr020. Epub 2011 Aug 18.
Sensitivity and specificity are common measures of the accuracy of a diagnostic test. The usual estimators of these quantities are unbiased if data on the diagnostic test result and the true disease status are obtained from all subjects in an appropriately selected sample. In some studies, verification of the true disease status is performed only for a subset of subjects, possibly depending on the result of the diagnostic test and other characteristics of the subjects. Estimators of sensitivity and specificity based on this subset of subjects are typically biased; this is known as verification bias. Methods have been proposed to correct verification bias under the assumption that the missing data on disease status are missing at random (MAR), that is, the probability of missingness depends on the true (missing) disease status only through the test result and observed covariate information. When some of the covariates are continuous, or the number of covariates is relatively large, the existing methods require parametric models for the probability of disease or the probability of verification (given the test result and covariates), and hence are subject to model misspecification. We propose a new method for correcting verification bias based on the propensity score, defined as the predicted probability of verification given the test result and observed covariates. This is estimated separately for those with positive and negative test results. The new method classifies the verified sample into several subsamples that have homogeneous propensity scores and allows correction for verification bias. Simulation studies demonstrate that the new estimators are more robust to model misspecification than existing methods, but still perform well when the models for the probability of disease and probability of verification are correctly specified.
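The stratification idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `stratified_sens_spec`, the equal-size strata, and the default of five strata are assumptions made for illustration, and the propensity scores are taken as already estimated (for example, from separate models fitted to test-positive and test-negative subjects). Within each test-result group, subjects are sorted by propensity score and split into strata. The disease rate among verified subjects in a stratum is then applied to every subject in that stratum, which is reasonable under MAR when propensity scores are roughly homogeneous within a stratum.

```python
import numpy as np

def stratified_sens_spec(test, verified, disease, propensity, n_strata=5):
    """Illustrative propensity-score-stratified estimates of sensitivity and specificity.

    test, verified : 0/1 arrays (test result; whether disease status was verified)
    disease        : 0/1 where verified, NaN (ignored) where not verified
    propensity     : estimated P(verified | test result, covariates), assumed given
    """
    test = np.asarray(test, dtype=bool)
    verified = np.asarray(verified, dtype=bool)
    disease = np.asarray(disease, dtype=float)
    propensity = np.asarray(propensity, dtype=float)

    # expected[t] = [expected diseased count, expected non-diseased count]
    # among all subjects (verified or not) with test result t
    expected = {True: np.zeros(2), False: np.zeros(2)}
    for t in (True, False):
        idx = np.flatnonzero(test == t)
        # sort this test-result group by propensity, then cut into near-equal strata
        order = idx[np.argsort(propensity[idx], kind="stable")]
        for stratum in np.array_split(order, n_strata):
            if stratum.size == 0:
                continue
            ver = stratum[verified[stratum]]
            if ver.size == 0:
                raise ValueError("a stratum has no verified subjects; use fewer strata")
            # disease rate among the verified, applied to the whole stratum
            p_dis = disease[ver].mean()
            expected[t] += stratum.size * np.array([p_dis, 1.0 - p_dis])

    sens = expected[True][0] / (expected[True][0] + expected[False][0])
    spec = expected[False][1] / (expected[True][1] + expected[False][1])
    return sens, spec

# Toy data: all 10 test-positives are verified (8 diseased); only 5 of the
# 10 test-negatives are verified (1 diseased).
test = np.array([1] * 10 + [0] * 10)
verified = np.array([1] * 10 + [1] * 5 + [0] * 5)
disease = np.array([1] * 8 + [0] * 2 + [1] + [0] * 4 + [np.nan] * 5)
propensity = np.array([1.0] * 10 + [0.5] * 10)

sens, spec = stratified_sens_spec(test, verified, disease, propensity, n_strata=1)
print(sens, spec)
```

In this toy data, the naive estimates from verified subjects alone are 8/9 for sensitivity and 4/6 for specificity, whereas the stratified sketch gives 0.8 for both, because the disease rate observed among verified test-negatives is extended to the unverified ones. With a single stratum per test result, the sketch reduces to a correction that conditions on the test result only; using more strata lets the correction also reflect covariates through the propensity score.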