Brennan Robert L, Kim Stella Y, Lee Won-Chan
The University of Iowa, Iowa City, USA.
The University of North Carolina at Charlotte, USA.
Educ Psychol Meas. 2022 Aug;82(4):617-642. doi: 10.1177/00131644211049746. Epub 2021 Nov 14.
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and free-response items, with the latter involving variability attributable to both items and raters. In this case, two distinct designs are needed to fully characterize the design and capture potential sources of error associated with each item format. Another example involves tests containing both testlets and one or more stand-alone sets of items. Testlet effects need to be taken into account for the testlet-based items, but not the stand-alone sets of items. This article presents an extension of MGT that faithfully models such complex test designs, along with two real-data examples. Among other things, these examples illustrate that estimates of error variance, error-tolerance ratios, and reliability-like coefficients can be biased if there is a mismatch between the user-specified universe of generalization and the complex nature of the test.
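As a rough illustration of the mixed-format case described above, the sketch below computes relative error variance separately for each level of the fixed facet and then combines them into a composite. It uses a persons-by-items design for the multiple-choice section and a persons-by-items-by-raters design for the free-response section. The formulas are the standard generalizability-theory expressions for these two crossed designs, not the specific estimators developed in the article. All variance components, sample sizes, and weights are hypothetical values chosen for illustration, and the function names are our own. The sketch also assumes the two sections share no items or raters, so error covariances between sections are taken to be zero.

```python
import numpy as np

def rel_error_var_p_x_i(var_pi, n_i):
    """Relative error variance for a crossed p x i design."""
    return var_pi / n_i

def rel_error_var_p_x_i_x_r(var_pi, var_pr, var_pir, n_i, n_r):
    """Relative error variance for a crossed p x i x r design."""
    return var_pi / n_i + var_pr / n_r + var_pir / (n_i * n_r)

def composite_coefficient(weights, universe_cov, error_vars):
    """Composite generalizability coefficient across fixed-facet levels.

    Assumes uncorrelated errors across levels (no shared items or raters),
    so the composite error variance is a weighted sum of level variances.
    """
    w = np.asarray(weights, dtype=float)
    cov = np.asarray(universe_cov, dtype=float)
    err = np.asarray(error_vars, dtype=float)
    composite_universe = w @ cov @ w        # sum over v, v' of w_v w_v' sigma_vv'(p)
    composite_error = np.sum(w ** 2 * err)  # sum over v of w_v^2 sigma_v^2(delta)
    coefficient = composite_universe / (composite_universe + composite_error)
    return coefficient, composite_error

# Hypothetical inputs (illustrative only, not taken from the article).
weights = [0.5, 0.5]                        # multiple-choice, free-response
universe_cov = [[0.04, 0.03],
                [0.03, 0.09]]               # person (universe-score) covariances

mc_error = rel_error_var_p_x_i(var_pi=0.20, n_i=40)
fr_error = rel_error_var_p_x_i_x_r(var_pi=0.30, var_pr=0.02, var_pir=0.10,
                                   n_i=5, n_r=2)

# Mis-specified alternative: treat free-response items as p x i, ignoring raters.
fr_error_no_raters = rel_error_var_p_x_i(var_pi=0.30, n_i=5)

coef_full, err_full = composite_coefficient(weights, universe_cov,
                                            [mc_error, fr_error])
coef_naive, err_naive = composite_coefficient(weights, universe_cov,
                                              [mc_error, fr_error_no_raters])

print(f"design-specific: error={err_full:.5f}, coefficient={coef_full:.3f}")
print(f"raters ignored:  error={err_naive:.5f}, coefficient={coef_naive:.3f}")
```

With these made-up numbers, dropping the rater facet from the free-response design yields a smaller composite error variance and a larger coefficient, which is the direction of bias one would expect when rater variability is left out of the model.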