Miller G Edward, Rotou Ourania, Twing Jon S
Division of Student Assessment, Texas Education Agency, 1701 North Congress Avenue, Austin, TX 78701-1494, USA.
J Appl Meas. 2004;5(2):172-7.
A number of state assessment programs that employ Rasch-based common item equating procedures estimate the equating constant using only those common items whose Rasch item difficulty parameter estimates on the two tests differ by less than 0.3 logits. This study presents evidence that this practice inflates the probability of incorrectly dropping an item from the common item set when the number of examinees is small (e.g., 500 or fewer) and has the reverse effect when the number of examinees is large (e.g., 5000 or more). An asymptotic experiment-wise error rate criterion was derived algebraically. The same criterion can also be applied to the Mantel-Haenszel statistic. Bonferroni test statistics were found to provide excellent approximations to the (asymptotically) exact test statistics.
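The sample-size dependence described in the abstract can be illustrated with a minimal sketch. It is not the authors' derivation. It assumes that the standard error of a Rasch difficulty estimate is roughly 1/sqrt(N p (1 - p)), with every examinee at p = 0.5, that the two estimates are independent, and that a two-sided Bonferroni correction is applied across the common items. The function names, the 30-item set, and alpha = 0.05 are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def bonferroni_critical_z(n_items, alpha=0.05):
    """Two-sided critical z with the error rate alpha split across n_items tests."""
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * n_items))


def approx_se_difference(n_examinees, p_correct=0.5):
    """Rough SE of the difference between two independent difficulty estimates.

    Assumption: item information is N * p * (1 - p), i.e. every examinee
    answers the item correctly with the same probability p_correct.
    """
    information = n_examinees * p_correct * (1.0 - p_correct)
    se_single = 1.0 / sqrt(information)
    return sqrt(2.0) * se_single


def implied_z(n_examinees, cutoff=0.3):
    """z value that a fixed logit cutoff corresponds to at this sample size."""
    return cutoff / approx_se_difference(n_examinees)


if __name__ == "__main__":
    critical = bonferroni_critical_z(n_items=30)
    for n in (500, 5000):
        print(n, round(implied_z(n), 2), round(critical, 2))
```

Under these assumptions the fixed 0.3 logit cutoff is a stricter criterion than the Bonferroni critical value at N = 500 and a far more lenient one at N = 5000, which matches the direction of the effect reported in the abstract.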