Dhamnetiya Deepak, Jha Ravi Prakash, Shalini Shalini, Bhattacharyya Krittika
Department of Community Medicine, Dr Baba Saheb Ambedkar Medical College and Hospital, Rohini, Delhi, India.
Lady Hardinge Medical College, Delhi, India.
J Lab Physicians. 2021 Sep 8;14(1):90-98. doi: 10.1055/s-0041-1734019. eCollection 2022 Mar.
Diagnostic tests are pivotal in modern medicine because they support statistical decision-making when confirming or ruling out the presence of a disease in patients. Sensitivity and specificity are the two most important and widely used measures of the inherent validity of a diagnostic test with a dichotomous outcome, assessed against a gold standard test. Other diagnostic indices, namely positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio, and the accuracy of a diagnostic test, are also discussed, along with the effect of prevalence on these indices. For tests measured on a continuous scale, we review the receiver operating characteristic (ROC) curve, which depicts the tradeoff between sensitivity and (1-specificity) across a series of cutoff points and so summarizes the performance of a classification model at all classification thresholds. The area under the ROC curve (AUROC) and the comparison of AUROCs of different tests are also discussed. Reliability of a test is defined in terms of repeatability: the test gives consistent results when repeated more than once on the same individual or material under the same conditions. In this article, we present the calculation of the kappa coefficient, a simple measure of agreement between two observers that adjusts the overall percentage of agreement for the agreement expected by chance. When the prevalence of disease in the population is low, a prospective study becomes increasingly difficult to conduct with the conventional design. Hence, we describe three more designs along with the conventional one and present the sensitivity and specificity calculations for each. We also offer some guidance on choosing the best design among these four, depending on a number of factors.
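As an illustration of the indices named above, the following minimal Python sketch computes them from a 2x2 table of test results against a gold standard. The class name `TwoByTwo`, the function names, and the example counts are hypothetical and are not taken from the article; the formulas are the standard textbook definitions, and the sketch assumes no cell combination that would cause division by zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from a test-versus-gold-standard table (hypothetical helper)."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent


def diagnostic_indices(t: TwoByTwo) -> dict:
    """Return standard diagnostic indices for a 2x2 table.

    Assumes every denominator is non-zero (for example, specificity < 1).
    """
    n = t.tp + t.fp + t.fn + t.tn
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
        "accuracy": (t.tp + t.tn) / n,
        "prevalence": (t.tp + t.fn) / n,
    }


def ppv_at_prevalence(sens: float, spec: float, prev: float) -> float:
    """Positive predictive value at a given prevalence (Bayes' theorem)."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


def npv_at_prevalence(sens: float, spec: float, prev: float) -> float:
    """Negative predictive value at a given prevalence (Bayes' theorem)."""
    true_neg = spec * (1 - prev)
    false_neg = (1 - sens) * prev
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    idx = diagnostic_indices(TwoByTwo(tp=90, fp=30, fn=10, tn=870))
    for name, value in idx.items():
        print(f"{name}: {value:.3f}")
    # Same sensitivity and specificity, lower prevalence: PPV drops.
    for prev in (0.10, 0.01):
        ppv = ppv_at_prevalence(idx["sensitivity"], idx["specificity"], prev)
        print(f"PPV at prevalence {prev:.2f}: {ppv:.3f}")
```

Sensitivity, specificity, and the likelihood ratios depend only on the test's behavior within the diseased and non-diseased groups, whereas the predictive values shift with prevalence, which is why the two `*_at_prevalence` helpers take prevalence as an explicit argument.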
The ultimate aim of this article is to provide the basic conceptual framework for, and interpretation of, various diagnostic test indices, ROC analysis, comparison of the diagnostic accuracy of different tests, and the reliability of a test, so that clinicians can use them effectively. Several R packages, as mentioned in this article, can prove handy during quantitative synthesis of clinical data related to diagnostic tests.
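To make the ROC and reliability concepts concrete, here is a small Python sketch, under stated assumptions, of two of the quantities discussed: the AUROC, computed as the probability that a randomly chosen diseased subject scores higher than a randomly chosen non-diseased subject (ties counted as one half), and Cohen's kappa for two observers giving a binary rating. The function names and example numbers are hypothetical; the article itself refers to R packages for these computations, and this is not a reproduction of them.

```python
from typing import Sequence


def auroc(diseased: Sequence[float], non_diseased: Sequence[float]) -> float:
    """AUROC by pairwise comparison (equivalent to the Mann-Whitney U statistic
    divided by the number of pairs).

    Assumes higher scores indicate disease and both groups are non-empty.
    """
    wins = 0.0
    for d in diseased:
        for h in non_diseased:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5  # ties contribute half a concordant pair
    return wins / (len(diseased) * len(non_diseased))


def cohens_kappa(both_pos: int, a_pos_b_neg: int,
                 a_neg_b_pos: int, both_neg: int) -> float:
    """Cohen's kappa for two observers (A and B) rating the same subjects.

    Assumes expected chance agreement is below 1 (ratings are not all identical).
    """
    n = both_pos + a_pos_b_neg + a_neg_b_pos + both_neg
    observed = (both_pos + both_neg) / n
    # Chance agreement from each observer's marginal proportions.
    a_pos = (both_pos + a_pos_b_neg) / n
    b_pos = (both_pos + a_neg_b_pos) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up test scores and agreement counts, for illustration only.
    print(auroc([0.9, 0.8, 0.6, 0.55], [0.7, 0.5, 0.4, 0.3]))
    print(cohens_kappa(both_pos=40, a_pos_b_neg=10, a_neg_b_pos=5, both_neg=45))
```

The pairwise form of the AUROC is quadratic in sample size, which is fine for illustration; rank-based implementations are preferable for large data sets.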