The Lack of Robustness of a Statistic Based on the Neyman-Pearson Lemma to Violations of Its Underlying Assumptions.

Author information

Sinharay Sandip

Affiliation

Educational Testing Service, Princeton, NJ, USA.

Publication information

Appl Psychol Meas. 2022 Jan;46(1):19-39. doi: 10.1177/01466216211049209. Epub 2021 Oct 23.

Abstract

Drasgow, Levine, and Zickar (1996) suggested a statistic based on the Neyman-Pearson lemma (NPL; e.g., Lehmann & Romano, 2005, p. 60) for detecting preknowledge on a known set of items. The statistic is a special case of the optimal appropriateness indices (OAIs) of Levine and Drasgow (1988) and is the most powerful statistic for detecting item preknowledge when the assumptions underlying the statistic hold for the data (e.g., Belov, 2016; Drasgow et al., 1996). This paper demonstrated, using real data analysis, that one assumption underlying the statistic of Drasgow et al. (1996) is often likely to be violated in practice. This paper also demonstrated, using simulated data, that the statistic is not robust to realistic violations of its underlying assumptions. Together, the results from the real data and the simulations demonstrate that the statistic of Drasgow et al. (1996) may not always be the optimum statistic in practice and occasionally has smaller power than another statistic for detecting preknowledge on a known set of items, especially when the assumptions underlying the former statistic do not hold. The findings of this paper demonstrate the importance of keeping in mind the underlying assumptions and the limitations of any statistic or method.
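
For readers unfamiliar with the lemma, the general form of a Neyman-Pearson test can be sketched as follows. This is the textbook statement only, not the specific statistic of Drasgow et al. (1996), whose exact models are not given in this abstract. The symbols are assumptions introduced for illustration: \mathbf{x} is an examinee's response vector, L_0 is its likelihood under the null model (no preknowledge), L_1 is its likelihood under the alternative model (preknowledge), and k_\alpha is the critical value for level \alpha.

```latex
% Likelihood-ratio test of a simple null H_0 against a simple alternative H_1.
% Notation (x, L_0, L_1, k_alpha) is illustrative and not taken from the paper.
\[
  \Lambda(\mathbf{x}) = \frac{L_1(\mathbf{x})}{L_0(\mathbf{x})},
  \qquad
  \text{reject } H_0 \text{ if } \Lambda(\mathbf{x}) > k_\alpha,
  \qquad
  P_{H_0}\{\Lambda(\mathbf{X}) > k_\alpha\} = \alpha .
\]
```

Among all tests of level \alpha, this one has the greatest power against H_1 (up to randomization on the boundary for discrete data), provided L_0 and L_1 are correctly specified. That proviso is the assumption the abstract says can fail in practice.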

Similar articles

4
Comparing the Performance of Eight Item Preknowledge Detection Statistics.
Appl Psychol Meas. 2016 Mar;40(2):83-97. doi: 10.1177/0146621615603327. Epub 2015 Sep 9.
6
Detection of Item Preknowledge Using Response Times.
Appl Psychol Meas. 2020 Jul;44(5):376-392. doi: 10.1177/0146621620909893. Epub 2020 Apr 13.
7
Detecting Item Preknowledge Using a Predictive Checking Method.
Appl Psychol Meas. 2017 Jun;41(4):243-263. doi: 10.1177/0146621616687285. Epub 2017 Jan 22.
8
Global Validation of Linear Model Assumptions.
J Am Stat Assoc. 2006 Mar 1;101(473):341. doi: 10.1198/016214505000000637.
9
Detecting Examinees With Item Preknowledge on Real Data.
Appl Psychol Meas. 2022 Jun;46(4):273-287. doi: 10.1177/01466216221084202. Epub 2022 Apr 21.

Cited by

1
Two New Models for Item Preknowledge.
Appl Psychol Meas. 2022 Sep;46(6):447-461. doi: 10.1177/01466216221108130. Epub 2022 Jun 22.

References

4
Detecting Item Preknowledge Using a Predictive Checking Method.
Appl Psychol Meas. 2017 Jun;41(4):243-263. doi: 10.1177/0146621616687285. Epub 2017 Jan 22.
6
Comparing the Performance of Eight Item Preknowledge Detection Statistics.
Appl Psychol Meas. 2016 Mar;40(2):83-97. doi: 10.1177/0146621615603327. Epub 2015 Sep 9.
7
Detecting Test Tampering Using Item Response Theory.
Educ Psychol Meas. 2015 Dec;75(6):931-953. doi: 10.1177/0013164414568716. Epub 2015 Jan 23.
