t-tests, non-parametric tests, and large studies--a paradox of statistical practice?

Affiliation

Unit of Biostatistics and Epidemiology, Oslo University Hospital, Oslo, N-0407, Norway.

Publication information

BMC Med Res Methodol. 2012 Jun 14;12:78. doi: 10.1186/1471-2288-12-78.

DOI:10.1186/1471-2288-12-78

PMID:22697476

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3445820/

Abstract

BACKGROUND

During the last 30 years, the median sample size of research studies published in high-impact medical journals has increased manyfold, while the use of non-parametric tests has increased at the expense of t-tests. This paper explores this paradoxical practice and illustrates its consequences.

METHODS

A simulation study is used to compare the rejection rates of the Wilcoxon-Mann-Whitney (WMW) test and the two-sample t-test for increasing sample size. Samples are drawn from skewed distributions with equal means and medians but with a small difference in spread. A hypothetical case study is used for illustration and motivation.
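The kind of simulation described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not name the distributions, so the shifted gamma construction, the function names (`skew_gap`, `matched_gamma_params`, `rejection_rates`), the default parameters, and the choice of the Welch variant of the t-test are all assumptions made here. The second group is built so that its mean and median match the first group exactly while its standard deviation is larger by a chosen ratio.

```python
import numpy as np
from scipy import stats
from scipy.optimize import brentq


def skew_gap(k):
    """(mean - median) / SD of a Gamma(shape=k, scale=1) distribution."""
    return (k - stats.gamma.ppf(0.5, k)) / np.sqrt(k)


def matched_gamma_params(k1=1.0, sd_ratio=1.1):
    """Hypothetical helper: parameters (shape, scale, shift) of a shifted gamma
    with the same mean and median as Gamma(k1, 1) and an SD sd_ratio times larger."""
    # Matching mean and median with a larger SD means the standardized
    # mean-median gap must shrink by the factor sd_ratio.
    target = skew_gap(k1) / sd_ratio
    k2 = brentq(lambda k: skew_gap(k) - target, k1, 100 * k1)
    scale2 = sd_ratio * np.sqrt(k1) / np.sqrt(k2)  # fixes the SD
    shift2 = k1 - k2 * scale2                      # fixes the mean
    return k2, scale2, shift2


def rejection_rates(n, n_sim=2000, k1=1.0, sd_ratio=1.1, alpha=0.05, seed=0):
    """Share of simulated data sets with p < alpha for the t-test and the WMW test."""
    rng = np.random.default_rng(seed)
    k2, scale2, shift2 = matched_gamma_params(k1, sd_ratio)
    reject_t = 0
    reject_w = 0
    for _ in range(n_sim):
        x = rng.gamma(k1, 1.0, size=n)
        y = shift2 + rng.gamma(k2, scale2, size=n)
        # Welch's unequal-variance t-test (an assumption; the abstract only
        # says "two-sample t-test").
        reject_t += stats.ttest_ind(x, y, equal_var=False).pvalue < alpha
        reject_w += stats.mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha
    return reject_t / n_sim, reject_w / n_sim


if __name__ == "__main__":
    for n in (25, 100, 1000):
        t_rate, w_rate = rejection_rates(n, n_sim=500)
        print(f"n={n:5d}  t-test: {t_rate:.3f}  WMW: {w_rate:.3f}")
```

Running the loop for increasing `n` gives the two rejection rates side by side. The actual numbers depend on the assumed distributions and will not reproduce the paper's figures.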

RESULTS

The WMW test produces, on average, smaller p-values than the t-test. This discrepancy increases with increasing sample size, skewness, and difference in spread. For heavily skewed data, the proportion of p<0.05 with the WMW test can be greater than 90% if the standard deviations differ by 10% and the number of observations is 1000 in each group. The high rejection rates of the WMW test should be interpreted as the power to detect that the probability that a random sample from one of the distributions is less than a random sample from the other distribution is greater than 50%.
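The last sentence of the Results can be made explicit by writing out the two null hypotheses. The notation is an assumption introduced here for illustration: $X$ and $Y$ are single observations drawn independently from the two (continuous) distributions, with means $\mu_X$ and $\mu_Y$.

```latex
\begin{align*}
  \text{t-test:} \quad & H_0 : \mu_X = \mu_Y \\
  \text{WMW test:} \quad & H_0 : P(X < Y) = \tfrac{1}{2}
\end{align*}
```

Two skewed distributions can share the same mean and the same median and still have $P(X < Y) \neq \tfrac{1}{2}$ when their shapes differ. In that situation the t-test's null hypothesis is true while the WMW test's is false, so a large sample gives the WMW test high power to reject a hypothesis that says nothing about means or medians.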

CONCLUSIONS

Non-parametric tests are most useful for small studies. Using non-parametric tests in large studies may provide answers to the wrong question, thus confusing readers. For studies with a large sample size, t-tests and their corresponding confidence intervals can and should be used even for heavily skewed data.

