Ioannidis John P A, Trikalinos Thomas A
Clinical Trials and Evidence Based Medicine Unit and Clinical and Molecular Epidemiology Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece.
Clin Trials. 2007;4(3):245-53. doi: 10.1177/1740774507079441.
The published clinical research literature may be distorted by the pursuit of statistically significant results.
We aimed to develop a test to explore biases stemming from the pursuit of nominal statistical significance.
The exploratory test evaluates whether there is a relative excess of formally significant findings in the published literature due to any reason (e.g., publication bias, selective analyses and outcome reporting, or fabricated data). The number of expected studies with statistically significant results is estimated and compared against the number of observed significant studies. The main application uses alpha = 0.05, but a range of alpha thresholds is also examined. Different values or prior distributions of the effect size are assumed. Given the typically low power (few studies per research question), the test may be best applied across domains of many meta-analyses that share common characteristics (interventions, outcomes, study populations, research environment).
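The comparison of expected against observed significant studies can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not necessarily the authors' implementation: each study is assumed to report an effect estimate (for example, a log odds ratio) with a standard error and to be tested with a two-sided z-test; the plausible true effect defaults to the fixed-effect inverse-variance pooled estimate; and the observed count O is compared with the expected count E by a one-degree-of-freedom chi-square statistic. The function names `study_power` and `excess_significance` are hypothetical.

```python
from math import erfc, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def study_power(theta, se, alpha=0.05):
    """Power of a two-sided z-test to detect an assumed true effect `theta`
    in a study whose estimate has standard error `se`."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = abs(theta) / se
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def excess_significance(effects, ses, theta=None, alpha=0.05):
    """Compare the observed number of significant studies (O) with the
    number expected (E) under an assumed true effect.

    Returns (O, E, chi-square statistic, p-value).
    """
    if theta is None:
        # Assumption: use the fixed-effect inverse-variance pooled estimate
        # as the plausible true effect; other values or priors could be used.
        weights = [1 / se ** 2 for se in ses]
        theta = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    n = len(effects)

    # O: studies whose own z-score crosses the significance threshold.
    observed = sum(abs(y / se) > z_crit for y, se in zip(effects, ses))
    # E: the sum of per-study powers under the assumed effect.
    expected = sum(study_power(theta, se, alpha) for se in ses)

    diff_sq = (observed - expected) ** 2
    stat = diff_sq / expected + diff_sq / (n - expected)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(stat / 2))
    return observed, expected, stat, p_value
```

Calling `excess_significance(effects, ses, alpha=a)` for several values of `a` corresponds to examining a range of alpha thresholds, and passing an explicit `theta` corresponds to assuming a different effect size. With few studies, an exact binomial comparison may be preferable to the chi-square approximation used here.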
As an illustration, we evaluated eight meta-analyses of clinical trials with more than 50 studies each and 10 meta-analyses of clinical efficacy for neuroleptic agents in schizophrenia; the 10 meta-analyses were also examined as a composite domain. The results differed from those of commonly used tests of publication bias. We found a clear or possible excess of significant studies in 6 of the 8 large meta-analyses and in the wide domain of neuroleptic treatments.
The proposed test is exploratory, may depend on prior assumptions, and should be applied cautiously.
An excess of significant findings may be documented in some clinical research fields.