O'Brien P C, Fleming T R
Biometrics. 1979 Sep;35(3):549-56.
A multiple testing procedure is proposed for comparing two treatments when response to treatment is both dichotomous (i.e., success or failure) and immediate. The proposed test statistic for each test is the usual (Pearson) chi-square statistic based on all data collected to that point. The maximum number (N) of tests and the number (m1 + m2) of observations collected between successive tests are fixed in advance. The overall size of the procedure is shown to be controlled with virtually the same accuracy as the single-sample chi-square test based on N(m1 + m2) observations. The power is also found to be virtually the same. However, by affording the opportunity to terminate early when one treatment performs markedly better than the other, the multiple testing procedure may eliminate the ethical dilemmas that often accompany clinical trials.
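The procedure described in the abstract can be sketched in code as follows. This is a minimal illustration, not the authors' implementation. The abstract does not state the stopping rule, so the sketch assumes the scaled form commonly associated with this design: stop at test i if (i/N) times the cumulative Pearson chi-square reaches a critical value. The function names, the `stages` input layout, and the `critical_value` parameter are hypothetical. The critical value must come from tables or simulation for the chosen N and significance level, and is not derived here.

```python
from typing import List, Optional, Tuple


def pearson_chi2(s1: int, n1: int, s2: int, n2: int) -> float:
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    s1, s2 are success counts and n1, n2 are sample sizes in the two arms.
    Returns 0.0 when any margin is empty and the statistic is undefined.
    """
    f1, f2 = n1 - s1, n2 - s2
    successes, failures = s1 + s2, f1 + f2
    if min(n1, n2, successes, failures) == 0:
        return 0.0
    total = n1 + n2
    return total * (s1 * f2 - s2 * f1) ** 2 / (n1 * n2 * successes * failures)


def sequential_test(
    stages: List[Tuple[int, int, int, int]],
    max_tests: int,
    critical_value: float,
) -> Tuple[Optional[int], float]:
    """Run up to max_tests tests on accumulating data.

    Each stage is (successes_arm1, m1, successes_arm2, m2): the new
    observations collected since the previous test.
    Assumed stopping rule (not stated in the abstract): stop at test i
    when (i / max_tests) * chi2_i >= critical_value.
    Returns (stopping test index or None, last chi-square computed).
    """
    s1 = n1 = s2 = n2 = 0
    chi2 = 0.0
    for i, (x1, m1, x2, m2) in enumerate(stages[:max_tests], start=1):
        # The statistic uses all data collected up to this point.
        s1, n1, s2, n2 = s1 + x1, n1 + m1, s2 + x2, n2 + m2
        chi2 = pearson_chi2(s1, n1, s2, n2)
        if (i / max_tests) * chi2 >= critical_value:
            return i, chi2
    return None, chi2
```

For example, `sequential_test([(10, 20, 10, 20), (15, 20, 5, 20)], max_tests=2, critical_value=4.0)` pools both stages at the second test before computing the statistic. The value 4.0 is a placeholder chosen for illustration only.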