Does standard deviation matter? Using "standard deviation" to quantify security of multistage testing.

Authors

Wang Chun, Zheng Yi, Chang Hua-Hua

Affiliation

University of Minnesota at Twin-Cities, 75 East River Road, Elliott Hall N658, Minneapolis, MN, 55403, USA

Publication

Psychometrika. 2014 Jan;79(1):154-74. doi: 10.1007/s11336-013-9356-y. Epub 2013 Dec 10.

DOI: 10.1007/s11336-013-9356-y
PMID: 24323297

Abstract

With the advent of web-based technology, online testing is becoming a mainstream mode in large-scale educational assessments. Most online tests are administered continuously in a testing window, which may pose test security problems because examinees who take the test earlier may share information with those who take the test later. Researchers have proposed various statistical indices to assess test security. One of the most often used is the average test-overlap rate, which was further generalized to the item pooling index (Chang & Zhang, 2002, 2003). These indices, however, are all defined as means (that is, the expected proportion of common items among examinees), and they were originally proposed for computerized adaptive testing (CAT). Recently, multistage testing (MST) has become a popular alternative to CAT. The unique features of MST make it important to report not only the mean but also the standard deviation (SD) of the test overlap rate, as we advocate in this paper. The SD of the test overlap rate adds important information to the test security profile because, for the same mean, a large SD indicates that certain groups of examinees share more common items than other groups. In this study, we analytically derived the lower bounds of the SD under MST, with the results under CAT as a benchmark. We show that when the mean overlap rate is the same in MST and CAT, the SD of test overlap tends to be larger in MST. A simulation study was conducted to provide empirical evidence. We also compared the security of MST under single-pool versus multiple-pool designs; both the analytical and the simulation results show that the non-overlapping multiple-pool design slightly increases the security risk.
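
To make the role of the SD concrete, the sketch below computes the mean and SD of pairwise test overlap for two toy administrations. It is an illustration only, not the paper's derivation: it assumes the overlap rate for a pair of examinees is the number of shared items divided by a fixed test length, uses the population SD over all pairs, and the item sets and the names `overlap_rates`, `overlap_summary`, `module_like`, and `spread_like` are invented for this example.

```python
from itertools import combinations
from statistics import mean, pstdev


def overlap_rates(tests):
    """Return the overlap rate (shared items / test length) for every pair of examinees."""
    length = len(tests[0])
    if any(len(t) != length for t in tests):
        raise ValueError("all tests must have the same length")
    return [len(a & b) / length for a, b in combinations(tests, 2)]


def overlap_summary(tests):
    """Return (mean, population SD) of the pairwise overlap rates."""
    rates = overlap_rates(tests)
    return mean(rates), pstdev(rates)


# Hypothetical data: four examinees, three items each, drawn from a six-item pool.
# "module_like": examinees receive one of two fixed item bundles, so a pair
# shares either every item or none.
module_like = [{1, 2, 3}, {1, 2, 3}, {4, 5, 6}, {4, 5, 6}]
# "spread_like": every pair of examinees shares exactly one item.
spread_like = [{1, 2, 3}, {1, 4, 5}, {2, 4, 6}, {3, 5, 6}]

for name, tests in [("module_like", module_like), ("spread_like", spread_like)]:
    m, sd = overlap_summary(tests)
    print(f"{name}: mean overlap = {m:.3f}, SD = {sd:.3f}")
```

Both toy designs have a mean overlap of one third, yet the bundled design has a much larger SD because some pairs of examinees see identical tests. This is the kind of difference a mean-only index cannot show.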

Similar articles

1. Does standard deviation matter? Using "standard deviation" to quantify security of multistage testing.
   Psychometrika. 2014 Jan;79(1):154-74. doi: 10.1007/s11336-013-9356-y. Epub 2013 Dec 10.
2. On-the-Fly Assembled Multistage Adaptive Testing.
   Appl Psychol Meas. 2015 Mar;39(2):104-118. doi: 10.1177/0146621614544519. Epub 2014 Sep 5.
3. Optimal number of strata for the stratified methods in computerized adaptive testing.
   Span J Psychol. 2014;17:E48. doi: 10.1017/sjp.2014.50.
4. Developing new online calibration methods for multidimensional computerized adaptive testing.
   Br J Math Stat Psychol. 2017 Feb;70(1):81-117. doi: 10.1111/bmsp.12083.
5. Investigating the relationship between item exposure and test overlap: item sharing and item pooling.
   Br J Math Stat Psychol. 2010 Feb;63(Pt 1):205-26. doi: 10.1348/000711009X430906. Epub 2009 Jun 19.
6. The Asymptotic Distribution of Average Test Overlap Rate in Computerized Adaptive Testing.
   Psychometrika. 2019 Dec;84(4):1129-1151. doi: 10.1007/s11336-019-09674-5. Epub 2019 Jul 1.
7. Comparing single-pool and multiple-pool designs regarding test security in computerized testing.
   Behav Res Methods. 2012 Sep;44(3):742-52. doi: 10.3758/s13428-011-0178-5.
8. Adapting cognitive diagnosis computerized adaptive testing item selection rules to traditional item response theory.
   PLoS One. 2020 Jan 10;15(1):e0227196. doi: 10.1371/journal.pone.0227196. eCollection 2020.
9. Adjusted Residuals for Evaluating Conditional Independence in IRT Models for Multistage Adaptive Testing.
   Psychometrika. 2024 Mar;89(1):317-346. doi: 10.1007/s11336-023-09935-4. Epub 2023 Nov 6.
10. Computerized adaptive testing: a mixture item selection approach for constrained situations.
    Br J Math Stat Psychol. 2005 Nov;58(Pt 2):239-57. doi: 10.1348/000711005X62945.

Cited by

1. On-the-fly parameter estimation based on item response theory in item-based adaptive learning systems.
   Behav Res Methods. 2023 Sep;55(6):3260-3280. doi: 10.3758/s13428-022-01953-x. Epub 2022 Sep 9.
2. The Asymptotic Distribution of Average Test Overlap Rate in Computerized Adaptive Testing.
   Psychometrika. 2019 Dec;84(4):1129-1151. doi: 10.1007/s11336-019-09674-5. Epub 2019 Jul 1.
3. On-the-Fly Assembled Multistage Adaptive Testing.
   Appl Psychol Meas. 2015 Mar;39(2):104-118. doi: 10.1177/0146621614544519. Epub 2014 Sep 5.

References

1. Comparing single-pool and multiple-pool designs regarding test security in computerized testing.
   Behav Res Methods. 2012 Sep;44(3):742-52. doi: 10.3758/s13428-011-0178-5.
2. Rotating item banks versus restriction of maximum exposure rates in computerized adaptive testing.
   Span J Psychol. 2008 Nov;11(2):618-25.
3. The maximum priority index method for severely constrained item selection in computerized adaptive testing.
   Br J Math Stat Psychol. 2009 May;62(Pt 2):369-83. doi: 10.1348/000711008X304376. Epub 2008 Jun 2.