Nam Jun-mo
Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health & Human Services, Executive Plaza South, Rockville, MD 20892-7240, USA.
Stat Med. 2006 May 15;25(9):1521-31. doi: 10.1002/sim.2321.
In this paper, we assess the performance of homogeneity tests for two or more kappa statistics when prevalence rates across reliability studies are assumed to be equal. The likelihood score method and the chi-square goodness-of-fit (GOF) test provide type I error rates that are satisfactorily close to the nominal level, but a Fleiss-like test is not satisfactory for small or moderate sample sizes. Simulations show that the score test is more powerful than the chi-square GOF test, and that the approximate sample size the score test requires to reach a specified power is substantially smaller than that required by the GOF test. In addition, the score test is robust to deviations from the equal-prevalence assumption, whereas the GOF test is highly sensitive to them and may give a grossly misleading type I error rate when that assumption is violated. We conclude that the homogeneity score test is the preferred method.
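To make the idea of testing homogeneity of kappa concrete, the following is a minimal sketch of an inverse-variance weighted ("Fleiss-like") statistic for binary ratings by two raters per subject. It is not the score test recommended above, whose formula the abstract does not give. The estimator and the large-sample variance used here are the ones commonly cited for the intraclass kappa; their use, along with the names `Study`, `kappa_and_variance`, and `fleiss_like_statistic`, is an assumption for illustration and is not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Counts of subject pairs in one reliability study (hypothetical layout)."""
    both_pos: int    # both ratings positive
    discordant: int  # the two ratings disagree
    both_neg: int    # both ratings negative


def kappa_and_variance(s):
    """Intraclass kappa estimate and an assumed large-sample variance.

    Undefined (division by zero) when the estimated prevalence is 0 or 1.
    """
    n = s.both_pos + s.discordant + s.both_neg
    p = (2 * s.both_pos + s.discordant) / (2 * n)   # estimated prevalence
    k = 1 - s.discordant / (2 * n * p * (1 - p))    # intraclass kappa
    var = (1 - k) / n * (
        (1 - k) * (1 - 2 * k) + k * (2 - k) / (2 * p * (1 - p))
    )
    return k, var


def fleiss_like_statistic(studies):
    """Weighted sum of squared deviations of each kappa from the pooled kappa.

    Returns (statistic, degrees of freedom). Weights are inverse variances,
    so a study with perfect agreement (variance 0) cannot be handled here.
    """
    estimates = [kappa_and_variance(s) for s in studies]
    weights = [1 / v for _, v in estimates]
    pooled = sum(w * k for w, (k, _) in zip(weights, estimates)) / sum(weights)
    stat = sum(w * (k - pooled) ** 2 for w, (k, _) in zip(weights, estimates))
    return stat, len(studies) - 1


if __name__ == "__main__":
    studies = [Study(40, 20, 40), Study(45, 10, 45)]
    print(fleiss_like_statistic(studies))
```

In this made-up example the two studies have estimated kappas of 0.6 and 0.8, and the statistic is about 4.0 on 1 degree of freedom. It would be referred to a chi-square distribution, for instance with `scipy.stats.chi2.sf`. The abstract's caution applies directly: with samples of this size, a test of this form may not hold its nominal type I error rate.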