Metsämuuronen Jari
Finnish Education Evaluation Centre, Helsinki, Finland.
Centre for Learning Analytics, University of Turku, Turku, Finland.
Front Psychol. 2022 Jul 18;13:891959. doi: 10.3389/fpsyg.2022.891959. eCollection 2022.
The reliability of a test score is discussed from the viewpoint of underestimation of and, specifically, deflation in estimates of reliability. Many widely used estimators are known to underestimate reliability. Empirical cases have shown that, with certain types of datasets, estimates by widely used estimators such as alpha, theta, omega, and rho may be deflated by 0.60 units of reliability or even more. The reason for this radical deflation lies in the item-score correlation embedded in the estimators: because estimates of this correlation are deflated when the numbers of categories in the two scales are far from each other, as is always the case with an item and the score, the estimates of reliability are deflated as well. A short-cut method to reach estimates closer to the true magnitude, new types of estimators, and deflation-corrected estimators of reliability (DCERs) are studied in the article. The empirical section is a study of the characteristics of combinations of DCERs formed by different bases for estimators (alpha, theta, omega, and rho), different alternative estimators of correlation as the linking factor between the item and the score variable, and different conditions. Based on the simulation, an initial typology of the families of DCERs is presented: some estimators are better with binary items and some with polytomous items; some are better with small sample sizes and some with larger ones.
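The attenuation described above can be illustrated with a small simulation. The sketch below is not the article's DCER method. It only compares coefficient alpha and the Pearson item-score correlation for the same simulated responses, once on a continuous scale and once coarsened to binary items with an extreme threshold. The sample size, number of items, loading, and threshold are all assumptions chosen for illustration, and part of the drop reflects information lost in coarsening rather than deflation alone.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an (n_persons, k_items) score matrix."""
    k = items.shape[1]
    sum_item_var = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - sum_item_var / total_var)

def item_score_correlations(items):
    """Pearson correlation between each item and the total score."""
    total = items.sum(axis=1)
    return np.array(
        [np.corrcoef(items[:, j], total)[0, 1] for j in range(items.shape[1])]
    )

# Illustrative settings (assumptions, not taken from the article).
rng = np.random.default_rng(0)
n_persons, n_items = 2000, 20
loading = 0.7      # strength of the link between trait and item
threshold = 1.5    # extreme cut point, so few persons "pass" each item

# One latent trait per person plus item-specific noise, unit item variance.
trait = rng.normal(size=(n_persons, 1))
noise = rng.normal(size=(n_persons, n_items))
continuous = loading * trait + np.sqrt(1 - loading**2) * noise

# Coarsen the same responses to two categories (0/1).
binary = (continuous > threshold).astype(float)

alpha_cont = cronbach_alpha(continuous)
alpha_bin = cronbach_alpha(binary)
rit_cont = item_score_correlations(continuous)
rit_bin = item_score_correlations(binary)

print(f"alpha, continuous items: {alpha_cont:.3f}")
print(f"alpha, binary items:     {alpha_bin:.3f}")
print(f"mean item-score r, continuous: {rit_cont.mean():.3f}")
print(f"mean item-score r, binary:     {rit_bin.mean():.3f}")
```

Moving the threshold toward 0 or increasing the number of response categories should narrow the gap between the two sets of estimates, which is consistent with the mechanism the abstract describes.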