Assessing DIF among small samples with separate calibration t and Mantel-Haenszel χ² statistics in the Rasch model.

Authors

Bernstein Ira, Samuels Ellery, Woo Ada, Hagge Sarah L

Affiliation

National Council of State Boards of Nursing (NCSBN), 111 E. Wacker Drive, Ste. 2900, Chicago, IL 60601-4277, USA.

Publication

J Appl Meas. 2013;14(4):389-99.

PMID: 24064579

Abstract

The National Council Licensure Examination (NCLEX) program has evaluated differential item functioning (DIF) using the Mantel-Haenszel (M-H) chi-square statistic. Since a Rasch model is assumed, DIF implies a difference in item difficulty between a reference group, e.g., White applicants, and a focal group, e.g., African-American applicants. The National Council of State Boards of Nursing (NCSBN) is planning to change the statistic used to evaluate DIF on the NCLEX from M-H to the separate calibration t-test (t). In actuality, M-H and t should yield identical results in large samples if the assumptions of the Rasch model hold (Linacre and Wright, 1989, also see Smith, 1996). However, as is true throughout statistics, "how large is large" is undefined, so it is quite possible that systematic differences exist in relatively smaller samples. This paper compares M-H and t in four sets of computer simulations. Three simulations used a ten-item test with nine fair items and one potentially containing DIF. To address instability that may result from a ten-item test, the fourth used a 30-item test with 29 fair items and one potentially containing DIF. Depending upon the simulation, the magnitude of population DIF (0, .5, 1.0, and 1.5 z-score units), the ability difference between the focal and reference group (-1, 0, and 1 z-score units), the focal group size (0, 10, 20, 40, 50, 80, 160, and 1000), and the reference group size (500 and 1000) were varied. The results were that: (a) differences in estimated DIF between the M-H and t statistics are generally small, (b) t tends to estimate lower chance probabilities than M-H with small sample sizes, (c) neither method is likely to detect DIF, especially when it is of slight magnitude in small focal group sizes, and (d) M-H does marginally better than t at detecting DIF but this improvement is also limited to very small focal group sizes.
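
The abstract names the two statistics but does not give their formulas. As a rough illustration only, the sketch below uses the common textbook forms: the separate calibration t as the difference between two Rasch item difficulty estimates divided by their pooled standard error, and the continuity-corrected Mantel-Haenszel chi-square (1 df) computed over ability strata. The function names, the input layout, and the sign convention (focal minus reference) are assumptions made for this example, not details taken from the paper.

```python
import math


def separate_calibration_t(d_ref, se_ref, d_focal, se_focal):
    """Difference in item difficulty (logits) over its pooled standard error.

    Assumes the item was calibrated separately in each group and that the
    two difficulty estimates are on a common scale. Sign convention
    (focal minus reference) is an assumption of this sketch.
    """
    return (d_focal - d_ref) / math.sqrt(se_ref ** 2 + se_focal ** 2)


def mantel_haenszel(strata):
    """Continuity-corrected M-H chi-square and common odds ratio.

    strata: iterable of (a, b, c, d) counts, one 2x2 table per ability level:
      a = reference correct, b = reference incorrect,
      c = focal correct,     d = focal incorrect.
    Returns (chi_square, odds_ratio); either may be NaN if undefined.
    """
    sum_a = sum_expected = sum_var = 0.0
    or_num = or_den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        n_ref, n_focal = a + b, c + d
        n_correct, n_incorrect = a + c, b + d
        sum_a += a
        sum_expected += n_ref * n_correct / n
        sum_var += n_ref * n_focal * n_correct * n_incorrect / (n * n * (n - 1))
        or_num += a * d / n
        or_den += b * c / n
    chi_square = ((abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var
                  if sum_var > 0 else float("nan"))
    odds_ratio = or_num / or_den if or_den > 0 else float("nan")
    return chi_square, odds_ratio


def chi_square_p_value_1df(chi_square):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


if __name__ == "__main__":
    # Made-up numbers, purely to show the call pattern.
    t = separate_calibration_t(d_ref=0.0, se_ref=0.3, d_focal=0.5, se_focal=0.4)
    chi_square, odds_ratio = mantel_haenszel([(30, 10, 10, 30)])
    print(t, chi_square, odds_ratio, chi_square_p_value_1df(chi_square))
```

With a small focal group the standard error of the focal difficulty estimate is large and the strata are sparse, which matches the abstract's point that neither statistic is likely to flag slight DIF in that setting.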

Similar articles

1
Assessing DIF among small samples with separate calibration t and Mantel-Haenszel χ² statistics in the Rasch model.
J Appl Meas. 2013;14(4):389-99.
2
DIF Cancellation in the Rasch Model.
J Appl Meas. 2013;14(2):118-28.
3
Rasch fit statistics as a test of the invariance of item parameter estimates.
J Appl Meas. 2003;4(2):153-63.
4
Angoff's delta method revisited: improving DIF detection under small samples.
Br J Math Stat Psychol. 2012 May;65(2):302-21. doi: 10.1111/j.2044-8317.2011.02025.x. Epub 2011 Aug 19.
5
An extension of a Bayesian approach to detect differential item functioning.
J Appl Meas. 2013;14(2):149-58.
6
Disease-related differential item functioning in the work instability scale for rheumatoid arthritis: converging results from three methods.
Arthritis Care Res (Hoboken). 2011 Aug;63(8):1159-69. doi: 10.1002/acr.20491.
7
Exploring the Utility of Logistic Mixed Modeling Approaches to Simultaneously Investigate Item and Testlet DIF on Testlet-based Data.
J Appl Meas. 2016;17(1):79-90.
8
Differential item functioning on the Mini-Mental State Examination. An application of the Mantel-Haenszel and standardization procedures.
Med Care. 2006 Nov;44(11 Suppl 3):S107-14. doi: 10.1097/01.mlr.0000245182.36914.4a.
9
A New Stopping Criterion for Rasch Trees Based on the Mantel-Haenszel Effect Size Measure for Differential Item Functioning.
Educ Psychol Meas. 2023 Feb;83(1):181-212. doi: 10.1177/00131644221077135. Epub 2022 Feb 28.
10
Type I error and statistical power of the Mantel-Haenszel procedure for detecting DIF: a meta-analysis.
Psychol Methods. 2013 Dec;18(4):553-71. doi: 10.1037/a0034306. Epub 2013 Oct 14.