Institute of Medical Biometry and Informatics, University of Heidelberg, Germany.
Br J Math Stat Psychol. 2011 Nov;64(3):410-26. doi: 10.1348/2044-8317.002003. Epub 2010 Dec 6.
Student's one-sample t-test is commonly used for inference about a population mean. As advocated in textbooks and articles, the normality assumption is often checked first with a preliminary goodness-of-fit (GOF) test. A recent paper by Schucany and Ng showed that, for the uniform distribution, screening samples with a pretest for normality leads to a more conservative conditional Type I error rate than applying the one-sample t-test without a preliminary GOF test. In contrast, for the exponential distribution, the conditional level is even more elevated than the Type I error rate of the t-test without a pretest. We examine the reasons behind these characteristics. In a simulation study, we investigate samples drawn from the exponential, lognormal, uniform, Student's t with 2 degrees of freedom (t(2)), and standard normal distributions that had passed normality screening, together with the ingredients of the test statistics calculated from these samples. For non-normal distributions, we found that preliminary testing for normality may change the distributions of the means and standard deviations of the selected samples, as well as the correlation between them (if the underlying distribution is non-symmetric), thus altering the distribution of the resulting test statistics. For skewed distributions, the excess in Type I error rate may be even more pronounced when one-sided hypotheses are tested.
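The two-stage procedure described above (screen each sample for normality, then apply the t-test only to samples that pass) can be sketched as a small Monte Carlo experiment. This is a minimal illustration under stated assumptions, not the authors' simulation: the abstract does not name the GOF test, so a Jarque-Bera statistic with a simulated cutoff stands in for it, and the sample size (20), the pretest level (0.05), the number of replications, and all function names are hypothetical choices made for this sketch.

```python
import math
import random

N = 20            # sample size (illustrative assumption)
T_CRIT = 2.093    # 97.5% quantile of Student's t with N - 1 = 19 df
ALPHA_PRE = 0.05  # nominal level of the normality pretest (assumption)


def t_stat(x, mu0):
    """One-sample t statistic for H0: population mean equals mu0."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((v - m) ** 2 for v in x) / (n - 1))
    return (m - mu0) / (s / math.sqrt(n))


def jarque_bera(x):
    """Jarque-Bera statistic; a stand-in pretest, not the paper's GOF test."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def pretest_cutoff(rng, reps):
    """Empirical (1 - ALPHA_PRE) quantile of the statistic under normality."""
    stats = sorted(
        jarque_bera([rng.gauss(0.0, 1.0) for _ in range(N)]) for _ in range(reps)
    )
    return stats[min(reps - 1, int((1 - ALPHA_PRE) * reps))]


def simulate(draw, mu0, cutoff, rng, reps):
    """Estimate the t-test rejection rate overall and among screened samples."""
    rejected_all = passed = rejected_passed = 0
    for _ in range(reps):
        x = [draw(rng) for _ in range(N)]
        reject = abs(t_stat(x, mu0)) > T_CRIT  # two-sided test at the 5% level
        rejected_all += reject
        if jarque_bera(x) <= cutoff:           # sample passes the normality screen
            passed += 1
            rejected_passed += reject
    return {
        "unconditional": rejected_all / reps,
        "pass_rate": passed / reps,
        "conditional": rejected_passed / passed if passed else float("nan"),
    }


# Each entry: (sampler, true mean), so the null hypothesis is true by construction.
DISTRIBUTIONS = {
    "normal": (lambda r: r.gauss(0.0, 1.0), 0.0),
    "uniform": (lambda r: r.random(), 0.5),
    "exponential": (lambda r: r.expovariate(1.0), 1.0),
}


def run(seed=1, reps=4000):
    rng = random.Random(seed)
    cutoff = pretest_cutoff(rng, reps)
    return {
        name: simulate(draw, mu0, cutoff, rng, reps)
        for name, (draw, mu0) in DISTRIBUTIONS.items()
    }


if __name__ == "__main__":
    for name, result in run().items():
        print(name, result)
```

Comparing the `conditional` and `unconditional` entries for each distribution gives a rough view of how screening shifts the Type I error rate. The estimates depend on the pretest, sample size, and seed chosen here, so they should not be expected to reproduce the figures reported in the paper.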