Brennan P, Croft P
ARC Epidemiological Research Unit, University of Manchester Medical School.
BMJ. 1994 Sep 17;309(6956):727-30. doi: 10.1136/bmj.309.6956.727.
In a randomised controlled trial, if the design is not flawed, different outcomes in the study groups must be due to the intervention itself or to chance imbalances between the groups. Because of this, tests of statistical significance are used to assess the validity of results from randomised studies. Most published papers in medical research, however, describe observational studies, which do not include randomised intervention. This paper argues that the continuing application of tests of significance to such non-randomised investigations is inappropriate. It draws a distinction between bias and chance imbalance on the one hand (which can affect both randomised and observational studies) and confounding on the other (a problem unique to observational investigations). It concludes that neither the P value nor the 95% confidence interval should be used as evidence for the validity of an observational result.