Institute of Statistics, Ulm University, Ulm, Germany.
Technical University of Munich, Munich, Germany.
Psychometrika. 2018 Mar;83(1):203-222. doi: 10.1007/s11336-017-9601-x. Epub 2018 Jan 2.
The two-sample problem for Cronbach's coefficient $\alpha$, as an estimate of test or composite score reliability, has attracted little attention compared to the extensive treatment of the one-sample case. Yet it is often necessary to compare the reliability of a test across different subgroups, across different tests, or between the short and long forms of a test. In this paper, we study statistical procedures for comparing two coefficients $\alpha_1$ and $\alpha_2$. The null hypothesis of interest is $H_0: \alpha_1 = \alpha_2$, which we test against one- or two-sided alternatives. For this purpose, resampling-based permutation and bootstrap tests are proposed for two-group multivariate non-normal models under the general asymptotically distribution-free (ADF) setting. These tests provide better control of the type-I error for finite or very small sample sizes, where the state-of-the-art ADF large-sample test may fail to attain the nominal significance level. Through a proper choice of a studentized test statistic, the resampling tests are modified so that they remain asymptotically valid even in non-exchangeable data frameworks. Extensions of this approach to other designs and reliability measures are also discussed. Finally, the usefulness of the proposed resampling-based testing strategies is demonstrated in an extensive simulation study and illustrated by real data applications.
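To make the idea of a studentized permutation test for two coefficients concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. Cronbach's coefficient is computed by the standard formula, k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. The studentization here uses a leave-one-subject-out jackknife variance estimate as a simple stand-in for the ADF variance estimator, which the abstract does not spell out. All function names (`cronbach_alpha`, `jackknife_var`, `studentized_stat`, `permutation_test`) are hypothetical, and pooling the two groups assumes both were measured on the same number of items.

```python
import numpy as np


def cronbach_alpha(x):
    """Cronbach's alpha for an (n subjects x k items) score matrix."""
    x = np.asarray(x, dtype=float)
    k = x.shape[1]
    item_var_sum = x.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)        # variance of the total score
    return k / (k - 1) * (1.0 - item_var_sum / total_var)


def jackknife_var(x):
    """Leave-one-subject-out jackknife variance estimate of alpha.

    Assumption: used only as a simple stand-in for the ADF variance
    estimator referred to in the abstract.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    loo = np.array([cronbach_alpha(np.delete(x, i, axis=0)) for i in range(n)])
    return (n - 1) / n * np.sum((loo - loo.mean()) ** 2)


def studentized_stat(x1, x2):
    """Difference of the two alphas divided by its estimated standard error."""
    diff = cronbach_alpha(x1) - cronbach_alpha(x2)
    return diff / np.sqrt(jackknife_var(x1) + jackknife_var(x2))


def permutation_test(x1, x2, n_perm=500, seed=0):
    """Two-sided permutation test of H0: alpha_1 = alpha_2.

    Subjects (rows) are pooled and randomly reassigned to the two groups;
    the studentized statistic is recomputed for every reassignment.
    """
    rng = np.random.default_rng(seed)
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1 = x1.shape[0]
    pooled = np.vstack([x1, x2])
    t_obs = studentized_stat(x1, x2)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        t = studentized_stat(pooled[idx[:n1]], pooled[idx[n1:]])
        count += int(abs(t) >= abs(t_obs))
    # Add-one correction keeps the p-value strictly positive.
    p_value = (count + 1) / (n_perm + 1)
    return t_obs, p_value
```

A call such as `permutation_test(group_a, group_b)` returns the observed studentized statistic and a two-sided permutation p-value, where `group_a` and `group_b` are hypothetical subject-by-item score matrices. A one-sided version would compare `t` and `t_obs` without taking absolute values.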