Cook K F, Dodd B G, Fitzpatrick S J
Baylor College of Medicine/Veterans Affairs, Houston, Texas, USA.
J Outcome Meas. 1999;3(1):1-20.
An alternative to dichotomous scoring of multiple items anchored to a common stem is scoring these items as a single polytomous item (testlet scoring). This study systematically compared the partial credit model (PCM), the generalized partial credit model (GPCM), and the graded response model (GRM) in the context of testlet scoring. Data sets included a sample from the fall 1994 administration of the SAT I (N = 2,548) and a simulated data set. Theta estimation, information, and model fit were analyzed. Correlations among theta estimates ranged from 0.9748 to 0.9921. The relationship among the information functions of the PCM, GPCM, and GRM reflected the discrimination parameter estimates for the latter two models. Suggestions are made with regard to model selection.