Department of Statistics, Florida State University, Tallahassee, FL, USA.
Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA.
J Gen Intern Med. 2018 Aug;33(8):1260-1267. doi: 10.1007/s11606-018-4425-7. Epub 2018 Apr 16.
Decision makers rely on meta-analytic estimates to trade off benefits and harms. Publication bias impairs the validity and generalizability of such estimates. The performance of statistical tests for publication bias has largely been compared in simulation studies and has not been systematically evaluated in empirical data.
This study compares seven commonly used publication bias tests (i.e., Begg's rank test, trim-and-fill, Egger's, Tang's, Macaskill's, Deeks', and Peters' regression tests) based on 28,655 meta-analyses available in the Cochrane Library.
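As an illustration of one of the tests named above, Egger's regression test is commonly described as regressing each study's standardized effect (effect divided by its standard error) on its precision (one over the standard error) and testing whether the intercept differs from zero. The following is a minimal sketch under that textbook formulation, not the study's own code; the function name `egger_intercept` and its inputs are hypothetical, and the p-value step (comparing the t statistic with a t distribution on n - 2 degrees of freedom) is left out.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, se_intercept, t_stat) for Egger's regression.

    Hypothetical helper: ordinary least squares of the standardized
    effect (effect / SE) on precision (1 / SE). An intercept far from
    zero suggests funnel-plot asymmetry.
    """
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least 3 studies and matching lengths")

    x = [1.0 / s for s in std_errors]                  # precision
    y = [e / s for e, s in zip(effects, std_errors)]   # standardized effect

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are equal; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))

    t_stat = intercept / se_intercept
    return intercept, se_intercept, t_stat
```

The returned t statistic would then be compared with a t distribution on n - 2 degrees of freedom; the significance threshold used in the study is not stated in this abstract.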
Egger's regression test detected publication bias more frequently than the other tests (15.7% in meta-analyses of binary outcomes and 13.5% in meta-analyses of non-binary outcomes). The proportion of statistically significant publication bias tests was greater for larger meta-analyses, especially for Begg's rank test and the trim-and-fill method. The agreement among Tang's, Macaskill's, Deeks', and Peters' regression tests for binary outcomes was moderately strong (most κ values were around 0.6). Tang's and Deeks' tests performed very similarly (κ > 0.9). The agreement among Begg's rank test, the trim-and-fill method, and Egger's regression test was weak or moderate (κ < 0.5).
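The agreement figures above are reported as κ. Assuming this denotes Cohen's kappa computed on paired significant/non-significant verdicts from two tests across the same meta-analyses, the calculation can be sketched as follows. The function name `cohen_kappa` and the toy inputs are hypothetical and are not taken from the study.

```python
def cohen_kappa(verdicts_a, verdicts_b):
    """Cohen's kappa for two equal-length sequences of categorical verdicts.

    Hypothetical helper: each element is one test's verdict (for example,
    1 = significant, 0 = not significant) on the same meta-analysis.
    """
    n = len(verdicts_a)
    if n == 0 or n != len(verdicts_b):
        raise ValueError("inputs must be non-empty and of equal length")

    # Observed agreement: share of meta-analyses with identical verdicts.
    p_observed = sum(1 for a, b in zip(verdicts_a, verdicts_b) if a == b) / n

    # Agreement expected by chance from each test's marginal frequencies.
    labels = set(verdicts_a) | set(verdicts_b)
    p_expected = sum(
        (list(verdicts_a).count(label) / n) * (list(verdicts_b).count(label) / n)
        for label in labels
    )

    if p_expected == 1.0:
        # Both tests always return the same single verdict.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)
```

For instance, with verdicts `[1, 1, 0, 0]` and `[1, 0, 0, 0]`, observed agreement is 0.75 and chance agreement is 0.5, so κ = 0.5.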
Given the relatively low agreement between many publication bias tests, meta-analysts should not rely on a single test and may apply multiple tests with different assumptions. Non-statistical approaches to evaluating publication bias (e.g., searching clinical trial registries, records of drug approval agencies, and scientific conference proceedings) remain essential.