Andreas Schneck
Department of Sociology, Ludwig-Maximilians-Universität München, Munich, Germany.
PeerJ. 2017 Nov 30;5:e4115. doi: 10.7717/peerj.4115. eCollection 2017.
Publication bias is a form of scientific misconduct. It threatens the validity of research results and the credibility of science. Although several tests on publication bias exist, no in-depth evaluations are available that examine which test performs best for different research settings.
Four tests on publication bias, Egger's test (FAT), p-uniform, the test of excess significance (TES), as well as the caliper test, were evaluated in a Monte Carlo simulation. Two different types of publication bias and its degree (0%, 50%, 100%) were simulated. The type of publication bias was defined either as file drawer, meaning the repeated analysis of new datasets until a significant result is obtained, or as p-hacking, meaning the inclusion of covariates in order to obtain a significant result. In addition, the underlying effect (β = 0, 0.5, 1, 1.5), effect heterogeneity, the number of observations in the simulated primary studies (n = 100, 500), and the number of primary studies entering the publication bias tests (k = 100, 1,000) were varied.
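As an illustration of this simulation design, the following is a minimal sketch, assuming two-group primary studies, one-sided selection for significant positive effects, and purely illustrative parameter values (none of the variable names or numbers below are taken from the article). It simulates the file drawer mechanism and applies Egger's FAT to the resulting meta-analytic sample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def simulate_study(beta, n, bias_degree, max_tries=50):
    """One two-group primary study; under file drawer bias a non-significant
    dataset is discarded and a fresh one is drawn."""
    for _ in range(max_tries):
        control = rng.normal(0.0, 1.0, n // 2)
        treated = rng.normal(beta, 1.0, n // 2)
        est = treated.mean() - control.mean()
        se = np.sqrt(treated.var(ddof=1) / len(treated) +
                     control.var(ddof=1) / len(control))
        # One-sided p value: selection is assumed to favour significant
        # positive effects, the classic funnel-asymmetry mechanism.
        p = stats.norm.sf(est / se)
        if p < 0.05 or rng.random() >= bias_degree:
            return est, se
    return est, se  # publish the last draw if no significant result appears

def egger_fat(estimates, ses):
    """Egger's regression (FAT): regress the standardized effect on precision
    and test whether the intercept differs from zero."""
    z = estimates / ses
    precision = 1.0 / ses
    X = np.column_stack([np.ones_like(precision), precision])
    coef, res_ss, *_ = np.linalg.lstsq(X, z, rcond=None)
    df = len(z) - 2
    cov = (res_ss[0] / df) * np.linalg.inv(X.T @ X)
    t_intercept = coef[0] / np.sqrt(cov[0, 0])
    return t_intercept, 2 * stats.t.sf(abs(t_intercept), df)

# One meta-analytic sample: k primary studies with a true effect of zero,
# complete (100%) file drawer bias, and varying study sizes so that the
# standard errors differ across studies.
k, beta, bias_degree = 100, 0.0, 1.0
sizes = rng.choice([50, 100, 200, 500], size=k)
results = np.array([simulate_study(beta, n, bias_degree) for n in sizes])
t_stat, p_value = egger_fat(results[:, 0], results[:, 1])
print(f"FAT intercept: t = {t_stat:.2f}, p = {p_value:.4f}")
```

In this sketch, the 50% bias condition would correspond to bias_degree = 0.5, and a p-hacking mechanism would instead reanalyze the same dataset with additional covariates until significance is reached; both variations fit into the same loop structure.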
All tests evaluated were able to identify publication bias in both the file drawer and the p-hacking condition. The false positive rates were, with the exception of the 15%- and 20%-caliper tests, unbiased. The FAT had the largest statistical power in the file drawer conditions, whereas under p-hacking the TES was, except under effect heterogeneity, slightly better. The caliper tests were, however, inferior to the other tests under effect homogeneity and had acceptable statistical power only in conditions with 1,000 primary studies.
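The TES flags an excess of statistically significant findings by comparing the observed number of significant primary studies with the number expected given the studies' power. As a sketch, with notation assumed here rather than taken from the article: let $O$ be the observed number of significant results among $k$ primary studies and $E = \sum_{i=1}^{k} \pi_i$ the expected number, where $\pi_i$ is study $i$'s estimated power to detect the assumed underlying effect at $\alpha = 0.05$. The statistic
$$A = \frac{(O - E)^2}{E} + \frac{(O - E)^2}{k - E}$$
is then referred to a $\chi^2$ distribution with one degree of freedom; a significant surplus of $O$ over $E$ indicates selective reporting.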
The FAT is recommended as a test for publication bias in standard meta-analyses with no or only small effect heterogeneity. If two-sided publication bias is suspected, as well as under p-hacking, the TES is the first alternative to the FAT. The 5%-caliper test is recommended under conditions of effect heterogeneity and a large number of primary studies, which may be the case when publication bias is examined in a discipline-wide setting and the primary studies cover different research problems.
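The caliper test, in turn, rests on the distribution of test statistics immediately around the critical value. A minimal sketch, with notation assumed here rather than taken from the article: within a narrow caliper around the two-sided critical value $z_c = 1.96$ (e.g., 5% above and below $z_c$ for the 5%-caliper test), let $n_{\text{over}}$ and $n_{\text{under}}$ denote the numbers of primary studies whose $z$ statistics fall just over and just under $z_c$. Absent publication bias, both outcomes are about equally likely,
$$n_{\text{over}} \sim \mathrm{Binomial}\left(n_{\text{over}} + n_{\text{under}},\ 0.5\right),$$
so a binomial test of $H_0\colon p = 0.5$ detects a surplus of narrowly significant results. Because only studies inside the caliper contribute, narrow calipers need many primary studies, in line with the large-$k$ recommendation above.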