Mittlböck M, Heinzl H
Core Unit for Medical Statistics and Informatics, Medical University of Vienna, Austria.
Stat Med. 2006 Dec 30;25(24):4321-33. doi: 10.1002/sim.2692.
The assessment of heterogeneity, or between-study variance, is an important issue in meta-analysis. It determines the statistical methods to be used and the interpretation of the results. Tests of heterogeneity may be misleading, either because of low power with sparse data or because they detect irrelevant amounts of heterogeneity when many studies are involved. In the former case, notable heterogeneity may remain unconsidered and an unsuitable model may be chosen; in the latter case, unnecessarily complex analysis strategies may result. Measures of heterogeneity are better suited to determining appropriate analysis strategies. We review two measures with different scaling and compare them with the heterogeneity test. Estimates of the within-study variance are discussed and a new total information measure is introduced. Various properties of the quantities in question are assessed in a simulation study. The heterogeneity test and the measures are not directly related to the amount of between-study variance but to the relative increase in variance due to heterogeneity. It is preferable to base the within-study variance estimate on the squared weights of the individual studies rather than on the sum of weights. A heterogeneity measure scaled to a fixed interval needs reference values for proper interpretation. A measure defined by the relation of between- to within-study variance has a more natural interpretation but no upper limit. Both measures quantify the impact of heterogeneity on the meta-analysis result, as both depend on the variance of the individual study effects and thus on the number of patients in the studies.
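The quantities named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption based on textbook definitions: Cochran's Q as the heterogeneity test statistic, the Higgins-Thompson I² as the measure scaled to a fixed interval, the DerSimonian-Laird moment estimate of the between-study variance, and two "typical" within-study variances (one from the sum of weights, one that also uses the squared weights). The function name `heterogeneity_summary` and the example numbers are hypothetical and are not taken from the paper.

```python
def heterogeneity_summary(effects, variances):
    """Textbook heterogeneity quantities for k study effects (illustrative sketch)."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    w = [1.0 / v for v in variances]            # inverse-variance weights
    sw = sum(w)                                 # sum of weights
    sw2 = sum(wi * wi for wi in w)              # sum of squared weights
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sw  # fixed-effect estimate

    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Two candidate "typical" within-study variances (assumed forms)
    s2_sum = k / sw                             # based on the sum of weights only
    s2_sq = df * sw / (sw * sw - sw2)           # also uses the squared weights

    # DerSimonian-Laird moment estimate of between-study variance, truncated at 0
    tau2 = max(0.0, (q - df) / (sw - sw2 / sw))

    # Measure on a fixed interval [0, 1): I^2 = (Q - df) / Q, truncated at 0
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # Unbounded measure: between- relative to within-study variance;
    # with s2_sq this equals max(0, (Q - df) / df)
    ratio = tau2 / s2_sq

    return {"Q": q, "df": df, "tau2": tau2, "s2_sum": s2_sum,
            "s2_sq": s2_sq, "I2": i2, "ratio": ratio}


if __name__ == "__main__":
    # Hypothetical data: three studies with equal within-study variance
    result = heterogeneity_summary([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
    for name, value in result.items():
        print(f"{name}: {value}")
```

With these hypothetical inputs, Q is about 8 on 2 degrees of freedom, I² is about 0.75, and the variance ratio is about 3. Under these definitions I² equals ratio / (1 + ratio), so the two measures carry the same information on different scales. When study variances are equal, the two within-study variance estimates coincide; they differ only when study sizes vary.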