Putter H, Fiocco M, Stijnen T
Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands.
Biom J. 2010 Feb;52(1):95-110. doi: 10.1002/bimj.200900073.
Diagnostic tests play an important role in clinical practice. The objective of a diagnostic test accuracy study is to compare an experimental diagnostic test with a reference standard. Most of these studies dichotomize test results as negative or positive, but the underlying test results can often be classified into more than two ordered categories. This article concerns the situation where multiple studies have evaluated the same diagnostic test, with the same multiple thresholds, in a population of non-diseased and diseased individuals. Recently, bivariate meta-analysis has been proposed for pooling sensitivity and specificity, which are likely to be negatively correlated within studies. These ideas have been extended to diagnostic tests with multiple thresholds, leading to a multinomial model with multivariate normal between-study variation. This approach is efficient, but it is computationally intensive and its convergence depends heavily on starting values. Moreover, it does not guarantee that the sensitivities/specificities are monotone in the threshold. Here, we propose a Poisson-correlated gamma frailty model, previously applied to a seemingly quite different situation: meta-analysis of paired survival curves. Because the approach is based on hazards, it guarantees monotonicity of the sensitivities/specificities for increasing thresholds. The approach is less efficient than the multinomial/normal approach. On the other hand, the Poisson-correlated gamma frailty model makes no assumptions about the relationship between sensitivity and specificity, gives consistent results, appears to be quite robust against different between-study variation models, and is computationally very fast and reliable with regard to the overall sensitivities/specificities.
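The monotonicity argument can be illustrated with a minimal sketch. It is not the Poisson-correlated gamma frailty model itself: it assumes a single study, discrete hazards estimated directly from category counts, and the convention that a result above a threshold counts as positive. The function names and the counts are hypothetical. The point is only that a quantity built as a running product of factors (1 - hazard), each between 0 and 1, cannot increase as the threshold rises.

```python
from itertools import accumulate
import operator


def hazards_from_counts(counts):
    """Discrete hazard per threshold: the share of results falling in
    category j among those with a result in category j or higher."""
    at_risk = sum(counts)
    hazards = []
    for n in counts[:-1]:  # no threshold lies above the last category
        hazards.append(n / at_risk if at_risk > 0 else 0.0)
        at_risk -= n
    return hazards


def survival_from_hazards(hazards):
    """P(result above threshold k) as the running product of (1 - h_j)."""
    return list(accumulate((1.0 - h for h in hazards), operator.mul))


# Hypothetical counts per ordered category (lowest to highest), one study.
diseased = [5, 10, 25, 60]
non_diseased = [50, 30, 15, 5]

# Sensitivity at threshold k: P(result above k | diseased).
sensitivity = survival_from_hazards(hazards_from_counts(diseased))
# Specificity at threshold k: 1 - P(result above k | non-diseased).
specificity = [
    1.0 - s for s in survival_from_hazards(hazards_from_counts(non_diseased))
]

print("sensitivity:", [round(s, 3) for s in sensitivity])
print("specificity:", [round(s, 3) for s in specificity])
```

Under these assumptions, sensitivity is non-increasing and specificity is non-decreasing in the threshold by construction, whatever values the hazards take.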