Evaluating Equating Methods for Varying Levels of Form Difference.

Author information

Sun Ting, Kim Stella Yun

Affiliations

University of Utah, Salt Lake City, USA.

University of North Carolina at Charlotte, USA.

Publication information

Educ Psychol Meas. 2024 Jun;84(3):510-529. doi: 10.1177/00131644231176989. Epub 2023 Jun 8.

DOI:10.1177/00131644231176989

PMID:38756465

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11095324/

Abstract

Equating is a statistical procedure that adjusts for differences in form difficulty so that scores on those forms can be used and interpreted comparably. In practice, however, equating methods are often implemented without considering the extent to which two forms differ in difficulty. This study examines the effect of the magnitude of the form difficulty difference on equating results under the random group (RG) and common-item nonequivalent group (CINEG) designs. Specifically, it evaluates the performance of six equating methods under a set of simulation conditions that includes varying levels of form difference. Under the RG design, mean equating was the most accurate method when there was no or only a small form difference, whereas equipercentile equating was the most accurate method when the difficulty difference was medium or large. Under the CINEG design, Tucker linear equating was the most accurate method when the difficulty difference was small or medium, and either chained equipercentile or frequency estimation equating was preferred when the difficulty difference was large. These results give practitioners evidence-based guidance for choosing an equating method at different levels of form difference. Because the condition of no form difficulty difference is also included, the study also informs testing companies about appropriate equating methods when two forms are similar in difficulty.
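As an illustration of the two RG-design methods named in the abstract, the following is a minimal Python sketch under simplifying assumptions. The function names (`mean_equate`, `percentile_rank`, `quantile_at`, `equipercentile_equate`) are hypothetical, and the equipercentile version uses a plain midpoint percentile rank with linear interpolation and no smoothing. It is not the implementation used in the study.

```python
from statistics import mean


def mean_equate(x, scores_x, scores_y):
    """Mean equating: shift a Form X score by the difference in form means."""
    return x + (mean(scores_y) - mean(scores_x))


def percentile_rank(x, scores):
    """Proportion of scores below x, plus half the proportion equal to x."""
    below = sum(s < x for s in scores)
    equal = sum(s == x for s in scores)
    return (below + 0.5 * equal) / len(scores)


def quantile_at(p, scores):
    """Score at proportion p, interpolating linearly between sorted scores.

    Uses the position p * n - 0.5 so that it mirrors the midpoint
    convention in percentile_rank (an assumption of this sketch).
    """
    ordered = sorted(scores)
    n = len(ordered)
    pos = min(max(p * n - 0.5, 0.0), n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def equipercentile_equate(x, scores_x, scores_y):
    """Map x to the Form Y score that has the same percentile rank."""
    return quantile_at(percentile_rank(x, scores_x), scores_y)
```

Mean equating applies the same constant shift to every score, which fits the case where the forms differ little. The equipercentile mapping lets the adjustment vary along the score scale, at the cost of estimating the two score distributions from the sampled groups.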
