Petropoulou Maria, Mavridis Dimitris
Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece.
Department of Primary Education, University of Ioannina School of Education, Ioannina, Greece.
Stat Med. 2017 Nov 30;36(27):4266-4280. doi: 10.1002/sim.7431. Epub 2017 Aug 16.
When we synthesize research findings via meta-analysis, it is common to assume that the true underlying effect differs across studies. Total variability consists of the within-study and between-study variances (heterogeneity). Established measures, such as I², quantify the proportion of the total variation attributable to heterogeneity. Many methods are available for estimating heterogeneity. The widely used DerSimonian and Laird estimation method has been challenged, but knowledge of the overall performance of heterogeneity estimators is incomplete. We identified 20 heterogeneity estimators in the literature and evaluated their performance via a simulation study in terms of mean absolute estimation error, coverage probability, and length of the confidence interval for the summary effect. Although previous simulation studies have recommended the Paule-Mandel estimator, it has not been compared with all the available estimators. For dichotomous outcomes, estimating heterogeneity through Markov chain Monte Carlo is a good choice if an informative prior distribution for heterogeneity is employed (e.g., one based on published Cochrane reviews). The nonparametric bootstrap and positive DerSimonian and Laird estimators perform well on all assessment criteria for both dichotomous and continuous outcomes. The Hartung-Makambi estimator can be the best choice when the heterogeneity values are close to 0.07 for dichotomous outcomes and for medium heterogeneity values (0.01, 0.05) for continuous outcomes. Hence, there are heterogeneity estimators (nonparametric bootstrap DerSimonian and Laird and positive DerSimonian and Laird) that perform better than the recommended Paule-Mandel estimator. Maximum likelihood provides the best performance for both types of outcome in the absence of heterogeneity.
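To make the quantities in the abstract concrete, the following is a minimal sketch of the standard DerSimonian and Laird moment estimator of the between-study variance, together with I² and the random-effects summary effect. It is not the authors' simulation code; the function name `dersimonian_laird` and its argument names are assumptions chosen for illustration, and negative estimates are truncated at zero as in the conventional form of the estimator.

```python
import math


def dersimonian_laird(effects, variances):
    """Illustrative DerSimonian-Laird estimate (hypothetical helper).

    effects   -- observed study effect sizes y_i
    variances -- within-study variances v_i (assumed known and positive)
    Returns (tau2, i2, summary_effect, standard_error).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q statistic and the scaling constant C.
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Moment estimator of between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)

    # I^2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    summary = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    return tau2, i2, summary, se
```

A Wald-type 95% confidence interval for the summary effect would then be `summary ± 1.96 * se`; the coverage probability and length of such intervals are among the criteria the abstract uses to compare estimators.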