Combining Propensity Scores and Common Items for Test Score Equating.

Authors

Laukaityte Inga, Wallin Gabriel, Wiberg Marie

Affiliations

Department of Applied Educational Science, Umeå University, Sweden.

School of Mathematical Sciences, Lancaster University, UK.

Publication information

Appl Psychol Meas. 2025 Jul 30:01466216251363240. doi: 10.1177/01466216251363240.

DOI:10.1177/01466216251363240
PMID:40757034
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12310624/
Abstract

Ensuring that test scores are fair and comparable across different test forms and different test groups is a significant statistical challenge in educational testing. Methods to achieve score comparability, a process known as test score equating, often rely on including common test items or assuming that test taker groups are similar in key characteristics. This study explores a novel approach that combines propensity scores, based on test takers' background covariates, with information from common items using kernel smoothing techniques for binary-scored test items. An empirical analysis using data from a high-stakes college admissions test evaluates the standard errors and differences in adjusted test scores. A simulation study examines the impact of factors such as the number of test takers, the number of common items, and the correlation between covariates and test scores on the method's performance. The findings demonstrate that integrating propensity scores with common item information reduces standard errors and bias more effectively than using either source alone. This suggests that balancing the groups on the test takers' covariates enhances the fairness and accuracy of test score comparisons across different groups. The proposed method highlights the benefits of considering all the collected data to improve score comparability.
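
The abstract does not spell out the computation, so the following is only a rough sketch of one plausible reading, not the authors' implementation. It assumes (a) propensity scores come from a logistic regression of group membership on covariates, (b) strata are formed by crossing the common-item (anchor) score with propensity-score quantile bins, (c) score distributions on a synthetic target population are obtained by post-stratification, and (d) the discrete distributions are continuized with a Gaussian kernel before equipercentile equating. All function names, the bandwidth `h = 0.6`, and the number of bins are hypothetical choices made for illustration.

```python
from math import erf, sqrt
import numpy as np

def fit_propensity(Z, g, iters=25):
    """Estimate P(g = 1 | Z) by logistic regression fitted with Newton's method."""
    X = np.column_stack([np.ones(len(Z)), Z])
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta += np.linalg.solve(hess, X.T @ (g - p))
    return 1.0 / (1.0 + np.exp(-X @ beta))

def make_strata(anchor, ps, n_bins):
    """Cross integer anchor scores with propensity-score quantile bins (assumed scheme)."""
    edges = np.quantile(ps, np.linspace(0, 1, n_bins + 1)[1:-1])
    return anchor * n_bins + np.digitize(ps, edges)

def poststratified_probs(score, stratum, target_w, n_scores, n_strata):
    """Score probabilities on the target population: sum_k P(score | k) * target_w[k]."""
    joint = np.zeros((n_scores, n_strata))
    np.add.at(joint, (score, stratum), 1.0)
    col = joint.sum(axis=0)
    cond = np.divide(joint, col, out=np.zeros_like(joint), where=col > 0)
    probs = cond @ target_w
    return probs / probs.sum()  # renormalize in case some strata are empty in this group

def norm_cdf(z):
    """Standard normal CDF, elementwise."""
    return 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))

def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF that preserves the discrete mean and variance."""
    mu = np.sum(scores * probs)
    var = np.sum(probs * (scores - mu) ** 2)
    a = np.sqrt(var / (var + h ** 2))
    z = (np.asarray(x, float)[..., None] - a * scores - (1.0 - a) * mu) / (a * h)
    return np.sum(probs * norm_cdf(z), axis=-1)

def kernel_quantile(p, scores, probs, h):
    """Invert kernel_cdf by bisection."""
    p = np.asarray(p, float)
    lo = np.full_like(p, scores.min() - 10.0)
    hi = np.full_like(p, scores.max() + 10.0)
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        below = kernel_cdf(mid, scores, probs, h) < p
        lo = np.where(below, mid, lo)
        hi = np.where(below, hi, mid)
    return 0.5 * (lo + hi)

def kernel_equate(x_scores, r, y_scores, s, h_x=0.6, h_y=0.6):
    """Equated values of form X scores on the form Y scale: G^{-1}(F(x))."""
    return kernel_quantile(kernel_cdf(x_scores, x_scores, r, h_x), y_scores, s, h_y)
```

In this sketch, one would call `fit_propensity` on the pooled covariates, pass the result and the anchor scores to `make_strata`, compute the target stratum weights as a mixture of the two groups' stratum proportions, obtain `r` and `s` with `poststratified_probs` for each form, and finally call `kernel_equate`. In practice the bandwidths are usually chosen by a data-driven criterion and the score distributions are presmoothed, both of which are omitted here.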


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/f277f4f7d761/10.1177_01466216251363240-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/57a7fc59a8ff/10.1177_01466216251363240-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/1e68d8db0027/10.1177_01466216251363240-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/a90d6950d80f/10.1177_01466216251363240-fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/96beab33e936/10.1177_01466216251363240-fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/1f0668e429cb/10.1177_01466216251363240-fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/6ececd312d19/10.1177_01466216251363240-fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/e8847f87f196/10.1177_01466216251363240-fig8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/cb747b7e97c3/10.1177_01466216251363240-fig9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/01a950af951e/10.1177_01466216251363240-fig10.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/5b56042e5e13/10.1177_01466216251363240-fig11.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/a72eeef5c935/10.1177_01466216251363240-fig12.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/14a8aa674a45/10.1177_01466216251363240-fig13.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/7ea9f6a19818/10.1177_01466216251363240-fig14.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e86/12310624/15020f4c7dd1/10.1177_01466216251363240-fig15.jpg

