Higgins Julian P T, Thompson Simon G
MRC Biostatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 2SR, UK.
Stat Med. 2002 Jun 15;21(11):1539-58. doi: 10.1002/sim.1186.
The extent of heterogeneity in a meta-analysis partly determines the difficulty in drawing overall conclusions. This extent may be measured by estimating a between-study variance, but interpretation is then specific to a particular treatment effect metric. A test for the existence of heterogeneity exists, but depends on the number of studies in the meta-analysis. We develop measures of the impact of heterogeneity on a meta-analysis, from mathematical criteria, that are independent of the number of studies and the treatment effect metric. We derive and propose three suitable statistics: H is the square root of the χ² heterogeneity statistic divided by its degrees of freedom; R is the ratio of the standard error of the underlying mean from a random effects meta-analysis to the standard error of a fixed effect meta-analytic estimate; and I² is a transformation of H that describes the proportion of total variation in study estimates that is due to heterogeneity. We discuss interpretation, interval estimates and other properties of these measures and examine them in five example data sets showing different amounts of heterogeneity. We conclude that H and I², which can usually be calculated for published meta-analyses, are particularly useful summaries of the impact of heterogeneity. One or both should be presented in published meta-analyses in preference to the test for heterogeneity.
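The definitions of H and I² given in the abstract can be illustrated with a minimal sketch. It assumes the χ² heterogeneity statistic (written `q` here) and the number of studies `k` are already known, takes the degrees of freedom to be `k - 1`, and writes the transformation as I² = (H² − 1)/H², which equals (Q − df)/Q. The function name `heterogeneity_measures` and the truncation of negative I² values to zero are illustrative choices, not taken from the abstract, and the input values in the usage line are invented.

```python
import math


def heterogeneity_measures(q: float, k: int) -> tuple[float, float]:
    """Return (H, I2) from a chi-squared heterogeneity statistic q and k studies.

    Hypothetical helper for illustration only.
    """
    if k < 2:
        raise ValueError("at least two studies are needed")
    if q < 0:
        raise ValueError("the heterogeneity statistic cannot be negative")

    df = k - 1  # degrees of freedom of the heterogeneity statistic

    # H: square root of the statistic divided by its degrees of freedom.
    h = math.sqrt(q / df)

    # I2 = (H^2 - 1) / H^2 = (q - df) / q.
    # Assumption: negative values are truncated to zero.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    return h, i2


# Invented example: q = 20 across k = 11 studies (df = 10).
h, i2 = heterogeneity_measures(20.0, 11)
print(f"H = {h:.3f}, I2 = {i2:.1%}")
```

Because both quantities depend only on the heterogeneity statistic and its degrees of freedom, they can usually be recovered from published meta-analyses that report those two numbers. R is not covered here: it also requires the standard errors from the fixed effect and random effects analyses.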