Schulzer M
Department of Medicine, University of British Columbia, Vancouver, Canada.
Muscle Nerve. 1994 Jul;17(7):815-9. doi: 10.1002/mus.880170719.
Common measures of the accuracy of diagnostic tests are reviewed. It is shown that the actual performance (predictive value) of these tests depends not only on their sensitivity and specificity, but also on the prevalence of the disease in the population tested (Bayes' theorem). The effect of an inaccurate "gold standard" on the calibration of a new diagnostic test is discussed. Receiver operating characteristic (ROC) curves are introduced as a tool for selecting an optimal cutpoint for a test, and for comparing different tests. Schemes are given for combining tests to improve their accuracy. When multiple continuous measurements are available, methods of discriminant analysis (and logistic regression) are shown to provide measurement combinations with improved accuracy. Examples and key references are provided.
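The dependence of predictive value on prevalence can be illustrated with a minimal sketch of Bayes' theorem. The function name `predictive_values` and the numeric inputs (90% sensitivity, 90% specificity, three prevalence levels) are illustrative assumptions and are not taken from the article.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population, via Bayes' theorem.

    All three arguments are proportions in [0, 1]. Hypothetical helper for
    illustration only; it does not guard against degenerate inputs that make
    a denominator zero (e.g. prevalence = 0 with specificity = 1).
    """
    # Expected proportions of the tested population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative test)
    return ppv, npv


# Same test characteristics, different populations (assumed values).
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

Under these assumed values, the positive predictive value rises from about 0.08 at 1% prevalence to 0.50 at 10% and 0.90 at 50%, although sensitivity and specificity are unchanged.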