Calculating Bias in Test Score Equating in a NEAT Design

Authors

Wiberg Marie, Laukaityte Inga

Affiliation

Umeå University, Sweden.

Publication

Appl Psychol Meas. 2025 Mar 24:01466216251330305. doi: 10.1177/01466216251330305.

DOI:10.1177/01466216251330305
PMID:40162326
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11948241/
Abstract

Test score equating is used to make scores from different test forms comparable, even when groups differ in ability. In practice, the non-equivalent groups with anchor test (NEAT) design is commonly used. The overall aim was to compare the amount of bias under different conditions when using either chained equating or frequency estimation with five different criterion functions: the identity function, linear equating, equipercentile equating, chained equating, and frequency estimation. We used real test data from a multiple-choice, binary-scored college admissions test to illustrate that the choice of criterion function matters. Further, we simulated data in line with the empirical data to examine differences in ability between groups, differences in item difficulty, differences in anchor test form and regular test form length, differences in correlations between the anchor test form and the regular test forms, and different sample sizes. The results indicate that how bias is defined heavily affects the conclusions we draw about which equating method is to be preferred in different scenarios. Practical implications of this for standardized tests are given, together with recommendations on how to calculate bias when evaluating equating transformations.
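
The abstract does not give the bias formula, so the following is only a minimal sketch of one common simulation-style definition: at each score point, bias is the average estimated equated score across replications minus the value of a chosen criterion function. The function name `bias_per_score`, the toy score range, and all numbers are hypothetical and are not taken from the paper. The sketch illustrates why the same estimates can look biased or unbiased depending on the criterion.

```python
from statistics import mean


def bias_per_score(replicated_estimates, criterion):
    """Bias at each score point.

    replicated_estimates: list of replications, each a list of estimated
        equated scores (one per raw score point).
    criterion: list of criterion equated scores (one per raw score point).
    Returns the mean estimate across replications minus the criterion.
    """
    n_scores = len(criterion)
    return [
        mean(rep[i] for rep in replicated_estimates) - criterion[i]
        for i in range(n_scores)
    ]


# Hypothetical raw scores on form X (toy example, not real data).
scores = [0, 1, 2, 3, 4]

# Two made-up replications of an estimated equating function.
replications = [
    [0.4, 1.6, 2.5, 3.4, 4.0],
    [0.6, 1.4, 2.5, 3.6, 4.0],
]

# Criterion 1: the identity function (equated score equals raw score).
identity_criterion = [float(x) for x in scores]

# Criterion 2: a hypothetical alternative criterion function.
alternative_criterion = [0.5, 1.5, 2.5, 3.5, 4.0]

bias_identity = bias_per_score(replications, identity_criterion)
bias_alternative = bias_per_score(replications, alternative_criterion)

print("Bias vs identity criterion:   ", bias_identity)
print("Bias vs alternative criterion:", bias_alternative)
```

In this toy setup the estimates show a bias of about half a score point at most score points against the identity criterion, but essentially none against the alternative criterion, which mirrors the abstract's point that the definition of bias drives the conclusion.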

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cd3/11948241/23240e3c2420/10.1177_01466216251330305-fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cd3/11948241/d85661565a10/10.1177_01466216251330305-fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cd3/11948241/3310629b1577/10.1177_01466216251330305-fig3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cd3/11948241/305eae1b6937/10.1177_01466216251330305-fig4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cd3/11948241/f7ec6cdb17b2/10.1177_01466216251330305-fig5.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cd3/11948241/5ec3727e9876/10.1177_01466216251330305-fig6.jpg
