University of Nevada, Reno.
J Exp Anal Behav. 2019 Mar;111(2):329-341. doi: 10.1002/jeab.501. Epub 2019 Jan 30.
Randomization tests are a class of nonparametric statistics used to determine the significance of treatment effects. Unlike parametric statistics, randomization tests do not assume a random sample and make none of the distributional assumptions that often preclude statistical inferences about single-case data. A feature that randomization tests share with parametric statistics, however, is the derivation of a p-value. P-values are notoriously misinterpreted and are partly responsible for the putative "replication crisis." Behavior analysts might question the utility of adding such a controversial index of statistical significance to their methods, so the aim of this paper is to describe the logic of randomization tests and its potentially beneficial consequences. In doing so, this paper will: (1) address the replication crisis as a behavior analyst views it, (2) differentiate the problematic p-values of parametric statistics from the arguably more useful p-values of randomization tests, and (3) review the logic of randomization tests and their unique fit within the behavior analytic tradition of studying behavioral processes that cut across species.
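To make the logic of a randomization test concrete, the following is a minimal sketch, not the authors' procedure. It assumes a single-case AB design in which the first treatment session was chosen at random, before data collection, from a set of admissible start points. The function name `ab_randomization_test`, the parameter `min_phase_len`, the mean-difference test statistic, and the session data are all hypothetical choices made for illustration.

```python
from statistics import mean


def ab_randomization_test(scores, actual_start, min_phase_len=3):
    """One-sided randomization test for a single-case AB design (illustrative).

    scores        -- session-by-session measurements for one subject
    actual_start  -- index of the first treatment session (assumed to have
                     been drawn at random from the admissible start points)
    min_phase_len -- assumed minimum number of sessions in each phase
    """
    n = len(scores)
    # Every start point the random assignment could have produced.
    candidate_starts = range(min_phase_len, n - min_phase_len + 1)
    if actual_start not in candidate_starts:
        raise ValueError("actual_start is not an admissible start point")

    def statistic(start):
        # Treatment-phase mean minus baseline-phase mean.
        return mean(scores[start:]) - mean(scores[:start])

    observed = statistic(actual_start)
    # Reference distribution: the statistic under each possible assignment,
    # holding the observed data fixed.
    reference = [statistic(s) for s in candidate_starts]
    # p-value: proportion of assignments at least as extreme as the observed one.
    p_value = sum(stat >= observed for stat in reference) / len(reference)
    return observed, p_value


# Hypothetical data: 12 sessions, treatment introduced at session index 5.
sessions = [2, 3, 2, 3, 2, 8, 9, 8, 9, 9, 8, 9]
observed, p_value = ab_randomization_test(sessions, actual_start=5)
print(observed, p_value)
```

In this sketch the p-value is defined only with respect to the random assignment that was actually performed, with no appeal to a sampled population or an assumed distribution. With seven admissible start points, the smallest attainable p-value is 1/7, so the number of possible assignments limits the resolution of the test.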