Higher-Order Asymptotics and Its Application to Testing the Equality of the Examinee Ability Over Two Sets of Items.

Affiliations

Educational Testing Service, Princeton, USA.

Aarhus University, Aarhus, Denmark.

Publication information

Psychometrika. 2019 Jun;84(2):484-510. doi: 10.1007/s11336-018-9627-8. Epub 2018 Jun 27.

DOI:10.1007/s11336-018-9627-8

PMID:29951971

Abstract

In educational and psychological measurement, researchers and/or practitioners are often interested in examining whether the ability of an examinee is the same over two sets of items. Such problems can arise in measurement of change, detection of cheating on unproctored tests, erasure analysis, detection of item preknowledge, etc. Traditional frequentist approaches that are used in such problems include the Wald test, the likelihood ratio test, and the score test (e.g., Fischer, Appl Psychol Meas 27:3-26, 2003; Finkelman, Weiss, & Kim-Kang, Appl Psychol Meas 34:238-254, 2010; Glas & Dagohoy, Psychometrika 72:159-180, 2007; Guo & Drasgow, Int J Sel Assess 18:351-364, 2010; Klauer & Rettig, Br J Math Stat Psychol 43:193-206, 1990; Sinharay, J Educ Behav Stat 42:46-68, 2017). This paper shows that approaches based on higher-order asymptotics (e.g., Barndorff-Nielsen & Cox, Inference and asymptotics. Springer, London, 1994; Ghosh, Higher order asymptotics. Institute of Mathematical Statistics, Hayward, 1994) can also be used to test for the equality of the examinee ability over two sets of items. The modified signed likelihood ratio test (e.g., Barndorff-Nielsen, Biometrika 73:307-322, 1986) and the Lugannani-Rice approximation (Lugannani & Rice, Adv Appl Prob 12:475-490, 1980), both of which are based on higher-order asymptotics, are shown to provide some improvement over the traditional frequentist approaches in three simulations. Two real data examples are also provided.
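
As a rough illustration of the kind of test the abstract describes, the sketch below computes the ordinary (first-order) signed likelihood ratio statistic for the hypothesis that an examinee's ability is the same on two item sets. It assumes a Rasch model with known item difficulties, which is an assumption made here for simplicity and not necessarily the model used in the paper. The function names (`rasch_loglik`, `rasch_mle`, `signed_lr_statistic`) are hypothetical, and the higher-order corrections (the modified signed likelihood ratio test and the Lugannani-Rice approximation) are not implemented.

```python
import math


def rasch_loglik(theta, responses, difficulties):
    """Log-likelihood of 0/1 responses under the Rasch model at ability theta."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        eta = theta - b
        # log P(X = x) = x * eta - log(1 + exp(eta))
        total += x * eta - math.log1p(math.exp(eta))
    return total


def rasch_mle(responses, difficulties, tol=1e-10, max_iter=200):
    """Newton-Raphson ability estimate; needs a raw score strictly between 0 and n."""
    score = sum(responses)
    if score == 0 or score == len(responses):
        raise ValueError("MLE is not finite for all-0 or all-1 response patterns")
    theta = 0.0
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
        gradient = score - sum(probs)
        information = sum(p * (1.0 - p) for p in probs)
        # Clamp the step to guard against overshooting.
        step = max(-1.0, min(1.0, gradient / information))
        theta += step
        if abs(step) < tol:
            break
    return theta


def signed_lr_statistic(x1, b1, x2, b2):
    """Signed root of the likelihood ratio statistic for H0: theta1 == theta2."""
    theta1 = rasch_mle(x1, b1)
    theta2 = rasch_mle(x2, b2)
    theta0 = rasch_mle(x1 + x2, b1 + b2)  # common ability under H0
    loglik_alt = rasch_loglik(theta1, x1, b1) + rasch_loglik(theta2, x2, b2)
    loglik_null = rasch_loglik(theta0, x1 + x2, b1 + b2)
    lr = max(0.0, 2.0 * (loglik_alt - loglik_null))
    sign = 1.0 if theta2 >= theta1 else -1.0
    return sign * math.sqrt(lr)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical data: the same six difficulties for both item sets.
difficulties = [-1.0, -0.5, 0.0, 0.0, 0.5, 1.0]
first_set = [1, 0, 0, 0, 0, 1]
second_set = [1, 1, 1, 1, 1, 0]
r = signed_lr_statistic(first_set, difficulties, second_set, difficulties)
# One-sided p-value for "ability is higher on the second set",
# using the first-order normal approximation to the distribution of r.
p_value = 1.0 - normal_cdf(r)
```

With short tests, the normal approximation to the distribution of `r` can be inaccurate, which is the motivation for the higher-order methods studied in the paper.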

Similar articles

The choice of the ability estimate with asymptotically correct standardized person-fit statistics.

Br J Math Stat Psychol. 2016 May;69(2):175-93. doi: 10.1111/bmsp.12067. Epub 2016 Apr 5.

Cheating on Unproctored Internet Test Applications: An Analysis of a Verification Test in a Real Personnel Selection Context.

Span J Psychol. 2018 Dec 3;21:E62. doi: 10.1017/sjp.2018.50.

Asymptotically Correct Standardization of Person-Fit Statistics Beyond Dichotomous Items.

Psychometrika. 2016 Dec;81(4):992-1013. doi: 10.1007/s11336-015-9465-x. Epub 2015 May 8.

Are Exam Questions Known in Advance? Using Local Dependence to Detect Cheating.

PLoS One. 2016 Dec 1;11(12):e0167545. doi: 10.1371/journal.pone.0167545. eCollection 2016.

Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

Appl Psychol Meas. 2017 Sep;41(6):403-421. doi: 10.1177/0146621617698453. Epub 2017 Mar 26.

Detecting Cheating Methods on Unproctored Internet Tests.

Psicothema. 2020 Nov;32(4):549-558. doi: 10.7334/psicothema2020.86.

The use of item scores and response times to detect examinees who may have benefited from item preknowledge.

Br J Math Stat Psychol. 2020 Nov;73(3):397-419. doi: 10.1111/bmsp.12187. Epub 2019 Aug 16.

Some Remarks on Applications of Tests for Detecting A Change Point to Psychometric Problems.

Psychometrika. 2017 Dec;82(4):1149-1161. doi: 10.1007/s11336-016-9531-z. Epub 2016 Oct 21.

Caught in the Act: Predicting Cheating in Unproctored Knowledge Assessment.

Assessment. 2021 Apr;28(3):1004-1017. doi: 10.1177/1073191120914970. Epub 2020 May 1.

Cited by

Application of Bayesian Decision Theory in Detecting Test Fraud.

Appl Psychol Meas. 2025 Jan 27:01466216251316559. doi: 10.1177/01466216251316559.

The Use of Theory of Linear Mixed-Effects Models to Detect Fraudulent Erasures at an Aggregate Level.

Educ Psychol Meas. 2022 Feb;82(1):177-200. doi: 10.1177/0013164421994893. Epub 2021 Mar 29.

The Lack of Robustness of a Statistic Based on the Neyman-Pearson Lemma to Violations of Its Underlying Assumptions.

Appl Psychol Meas. 2022 Jan;46(1):19-39. doi: 10.1177/01466216211049209. Epub 2021 Oct 23.

References

Detecting Test Tampering Using Item Response Theory.

Educ Psychol Meas. 2015 Dec;75(6):931-953. doi: 10.1177/0013164414568716. Epub 2015 Jan 23.

Saddlepoint Approximations of the Distribution of the Person Parameter in the Two Parameter Logistic Model.

Psychometrika. 2015 Sep;80(3):665-88. doi: 10.1007/s11336-014-9405-1. Epub 2014 Apr 8.

On the Unidentifiability of the Fixed-Effects 3PL Model.

Psychometrika. 2015 Jun;80(2):450-67. doi: 10.1007/s11336-014-9404-2. Epub 2014 Jan 31.
