Krueger Joachim I, Heck Patrick R
Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI, United States.
Front Psychol. 2017 Jun 9;8:908. doi: 10.3389/fpsyg.2017.00908. eCollection 2017.
Many statistical methods yield the probability of the observed data, or data more extreme, under the assumption that a particular hypothesis is true. This probability is commonly known as 'the' p-value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p-value has been subjected to much speculation, analysis, and criticism. We explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say.
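To make the distinction in the abstract concrete, the following is a minimal sketch, not the authors' simulation design. It computes a two-sided p-value for a z-test (the probability of data at least this extreme if the null hypothesis is true) and then estimates by simulation how often the null hypothesis is actually true among significant results. The function names and all parameter values (sample size, effect size, prior probability of the null, number of simulated studies) are assumptions chosen for illustration only.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value: P(|Z| >= |z|) when Z is standard normal under the null."""
    return math.erfc(abs(z) / math.sqrt(2))


def share_of_true_nulls_among_significant(
    n_studies=20000,   # assumed number of simulated studies
    n=30,              # assumed sample size per study
    effect=0.5,        # assumed true mean (in SD units) when the null is false
    prior_null=0.5,    # assumed prior probability that the null is true
    alpha=0.05,        # conventional significance threshold
    seed=1,
):
    """Estimate P(null true | p < alpha) for a one-sample z-test with known SD = 1."""
    rng = random.Random(seed)
    significant_null = 0
    significant_alt = 0
    for _ in range(n_studies):
        null_true = rng.random() < prior_null
        mu = 0.0 if null_true else effect
        # Shortcut: draw the sample mean directly, since it is N(mu, 1/n).
        sample_mean = rng.gauss(mu, 1.0 / math.sqrt(n))
        z = sample_mean * math.sqrt(n)
        if two_sided_p(z) < alpha:
            if null_true:
                significant_null += 1
            else:
                significant_alt += 1
    total = significant_null + significant_alt
    return significant_null / total if total else float("nan")


if __name__ == "__main__":
    print("p for z = 1.96:", round(two_sided_p(1.96), 4))
    print("estimated P(null true | p < .05):",
          round(share_of_true_nulls_among_significant(), 3))
```

With equal priors, the estimated share should sit near alpha / (alpha + power), which shows that a p-value and the probability of the hypothesis given the evidence are different quantities, even when one tracks the other reasonably well. Changing `n`, `effect`, or `prior_null` shifts that share, in line with the abstract's point that sample size affects inferential accuracy.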