The Lack of Robustness of a Statistic Based on the Neyman-Pearson Lemma to Violations of Its Underlying Assumptions.

Author information

Sandip Sinharay

Affiliation

Educational Testing Service, Princeton, NJ, USA.

Publication information

Appl Psychol Meas. 2022 Jan;46(1):19-39. doi: 10.1177/01466216211049209. Epub 2021 Oct 23.

DOI:10.1177/01466216211049209

PMID:34898745

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8655463/

Abstract

Drasgow, Levine, and Zickar (1996) suggested a statistic based on the Neyman-Pearson lemma (NPL; e.g., Lehmann & Romano, 2005, p. 60) for detecting preknowledge on a known set of items. The statistic is a special case of the optimal appropriateness indices (OAIs) of Levine and Drasgow (1988) and is the most powerful statistic for detecting item preknowledge when the assumptions underlying the statistic hold for the data (e.g., Belov, 2016; Drasgow et al., 1996). This paper demonstrated, using real data analysis, that one assumption underlying the statistic of Drasgow et al. (1996) is often likely to be violated in practice. This paper also demonstrated, using simulated data, that the statistic is not robust to realistic violations of its underlying assumptions. Together, the results from the real data and the simulations demonstrate that the statistic of Drasgow et al. (1996) may not always be the optimum statistic in practice and occasionally has smaller power than another statistic for detecting preknowledge on a known set of items, especially when the assumptions underlying the former statistic do not hold. The findings of this paper demonstrate the importance of keeping in mind the underlying assumptions and the limitations of any statistic or method.
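To make the idea of a Neyman-Pearson-type statistic concrete, the sketch below computes a log likelihood ratio for one examinee on a test with a known set of compromised items. It is only an illustration and not the statistic of Drasgow et al. (1996), whose exact form is not given in this abstract. Everything specific here is an assumption made for the example: the two-parameter logistic (2PL) item model, a known ability `theta`, the hypothetical function names, and the alternative hypothesis that preknowledge acts like an ability increase of `boost` on the compromised items only.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed item model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(responses, probs):
    """Log likelihood of 0/1 responses given per-item success probabilities."""
    return sum(math.log(p) if x == 1 else math.log(1.0 - p)
               for x, p in zip(responses, probs))


def log_lr_preknowledge(responses, theta, items, compromised, boost=2.0):
    """Log likelihood ratio of a preknowledge model versus a no-preknowledge model.

    items:       list of (a, b) item parameters, assumed known
    compromised: set of indices of the items assumed to be compromised
    boost:       hypothetical ability gain on compromised items (an assumption)
    """
    # Null model: a single ability governs every item.
    null_probs = [p_correct(theta, a, b) for a, b in items]
    # Alternative model: ability is inflated on the compromised items only.
    alt_probs = [p_correct(theta + boost, a, b) if j in compromised else p
                 for j, ((a, b), p) in enumerate(zip(items, null_probs))]
    return log_likelihood(responses, alt_probs) - log_likelihood(responses, null_probs)


# Example: four items, the last two assumed compromised.
items = [(1.0, -0.5), (1.2, 0.0), (0.8, 0.5), (1.5, 1.0)]
compromised = {2, 3}
print(log_lr_preknowledge([1, 0, 1, 1], 0.0, items, compromised))
```

Large positive values favor the preknowledge model. When both hypotheses are simple and correctly specified, the Neyman-Pearson lemma says that thresholding this ratio gives the most powerful test at a given level. That guarantee depends on the assumed alternative matching the data, which is the kind of assumption the abstract reports as fragile in practice.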


Similar articles

The Lack of Robustness of a Statistic Based on the Neyman-Pearson Lemma to Violations of Its Underlying Assumptions.

Appl Psychol Meas. 2022 Jan;46(1):19-39. doi: 10.1177/01466216211049209. Epub 2021 Oct 23.

On the Equivalence of a Likelihood Ratio of Drasgow, Levine, and Zickar (1996) and the Statistic Based on the Neyman-Pearson Lemma of Belov (2016).

Appl Psychol Meas. 2017 Mar;41(2):145-149. doi: 10.1177/0146621616673597. Epub 2016 Oct 24.

Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

Appl Psychol Meas. 2017 Sep;41(6):403-421. doi: 10.1177/0146621617698453. Epub 2017 Mar 26.

Comparing the Performance of Eight Item Preknowledge Detection Statistics.

Appl Psychol Meas. 2016 Mar;40(2):83-97. doi: 10.1177/0146621615603327. Epub 2015 Sep 9.

The use of item scores and response times to detect examinees who may have benefited from item preknowledge.

Br J Math Stat Psychol. 2020 Nov;73(3):397-419. doi: 10.1111/bmsp.12187. Epub 2019 Aug 16.

Detection of Item Preknowledge Using Response Times.

Appl Psychol Meas. 2020 Jul;44(5):376-392. doi: 10.1177/0146621620909893. Epub 2020 Apr 13.

Detecting Item Preknowledge Using a Predictive Checking Method.

Appl Psychol Meas. 2017 Jun;41(4):243-263. doi: 10.1177/0146621616687285. Epub 2017 Jan 22.

Global Validation of Linear Model Assumptions.

J Am Stat Assoc. 2006 Mar 1;101(473):341. doi: 10.1198/016214505000000637.

Detecting Examinees With Item Preknowledge on Real Data.

Appl Psychol Meas. 2022 Jun;46(4):273-287. doi: 10.1177/01466216221084202. Epub 2022 Apr 21.

Asymptotically Correct Standardization of Person-Fit Statistics Beyond Dichotomous Items.

Psychometrika. 2016 Dec;81(4):992-1013. doi: 10.1007/s11336-015-9465-x. Epub 2015 May 8.

Cited by

Two New Models for Item Preknowledge.

Appl Psychol Meas. 2022 Sep;46(6):447-461. doi: 10.1177/01466216221108130. Epub 2022 Jun 22.

References

Summed Score Likelihood-Based Indices for Testing Latent Variable Distribution Fit in Item Response Theory.

Educ Psychol Meas. 2018 Oct;78(5):857-886. doi: 10.1177/0013164417717024. Epub 2017 Jul 7.

Higher-Order Asymptotics and Its Application to Testing the Equality of the Examinee Ability Over Two Sets of Items.

Psychometrika. 2019 Jun;84(2):484-510. doi: 10.1007/s11336-018-9627-8. Epub 2018 Jun 27.

Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

Appl Psychol Meas. 2017 Sep;41(6):403-421. doi: 10.1177/0146621617698453. Epub 2017 Mar 26.

Detecting Item Preknowledge Using a Predictive Checking Method.

Appl Psychol Meas. 2017 Jun;41(4):243-263. doi: 10.1177/0146621616687285. Epub 2017 Jan 22.

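On the Equivalence of a Likelihood Ratio of Drasgow, Levine, and Zickar (1996) and the Statistic Based on the Neyman-Pearson Lemma of Belov (2016).

Appl Psychol Meas. 2017 Mar;41(2):145-149. doi: 10.1177/0146621616673597. Epub 2016 Oct 24.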

Comparing the Performance of Eight Item Preknowledge Detection Statistics.

Appl Psychol Meas. 2016 Mar;40(2):83-97. doi: 10.1177/0146621615603327. Epub 2015 Sep 9.

Detecting Test Tampering Using Item Response Theory.

Educ Psychol Meas. 2015 Dec;75(6):931-953. doi: 10.1177/0013164414568716. Epub 2015 Jan 23.

Item Response Theory with Estimation of the Latent Population Distribution Using Spline-Based Densities.

Psychometrika. 2006 Jun;71(2):281. doi: 10.1007/s11336-004-1175-8. Epub 2017 Feb 11.

The meaning and use of the area under a receiver operating characteristic (ROC) curve.

Radiology. 1982 Apr;143(1):29-36. doi: 10.1148/radiology.143.1.7063747.
