Using Interpretable Machine Learning for Differential Item Functioning Detection in Psychometric Tests.

Authors

Kraus Elisabeth Barbara, Wild Johannes, Hilbert Sven

Affiliations

LMU Munich, Germany.

University of Regensburg, Germany.

Publication information

Appl Psychol Meas. 2024 Jul;48(4-5):167-186. doi: 10.1177/01466216241238744. Epub 2024 Mar 11.

DOI: 10.1177/01466216241238744
PMID: 39055539
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11268249/

Abstract

This study presents a novel method for investigating test fairness and differential item functioning (DIF) that combines psychometrics and machine learning. Test unfairness manifests itself in systematic and demographically imbalanced influences of confounding constructs on residual variances in psychometric modeling. Our method aims to account for the resulting complex relationships between response patterns and demographic attributes. Specifically, it measures the importance of individual test items and latent ability scores, relative to a random baseline variable, when predicting demographic characteristics. We conducted a simulation study to examine how the method performs under various conditions, such as linear and complex impact and unfairness, a varying number of factors and unfair items, and varying test length. We found that our method detects unfair items as reliably as Mantel-Haenszel statistics or logistic regression analyses but generalizes to multidimensional scales in a straightforward manner. To apply the method to empirical data, we used random forests to predict migration background from ability scores and single items of an elementary school reading comprehension test. One item was found to be unfair according to all proposed decision criteria. Further analysis of the item's content provided plausible explanations for this finding. Analysis code is available at: https://osf.io/s57rw/?view_only=47a3564028d64758982730c6d9c6c547.
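
The abstract benchmarks the proposed approach against Mantel-Haenszel statistics. For readers unfamiliar with that baseline, the following is a minimal sketch of the classical Mantel-Haenszel DIF computation for one dichotomous item, matching examinees on total test score. It is not the authors' analysis code (that is at the OSF link above). The function name `mantel_haenszel_dif`, the row-per-examinee data layout, and the `"ref"` / `"focal"` group labels are assumptions made for illustration.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(responses, groups, item):
    """Mantel-Haenszel DIF statistics for one dichotomous item.

    responses: list of rows, each a list of 0/1 item scores (assumed layout)
    groups:    list of "ref" / "focal" labels, one per row (assumed coding)
    item:      column index of the studied item
    """
    # One 2x2 table [A, B, C, D] per total-score stratum:
    # A = ref correct, B = ref incorrect, C = focal correct, D = focal incorrect.
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for row, group in zip(responses, groups):
        cell = (0 if row[item] == 1 else 1) + (0 if group == "ref" else 2)
        strata[sum(row)][cell] += 1

    num = den = observed = expected = variance = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with one examinee carries no information
        num += a * d / n
        den += b * c / n
        n_ref, n_focal = a + b, c + d
        n_correct, n_incorrect = a + c, b + d
        observed += a
        expected += n_ref * n_correct / n
        variance += n_ref * n_focal * n_correct * n_incorrect / (n * n * (n - 1))

    if num == 0 or den == 0 or variance == 0:
        raise ValueError("tables too sparse to estimate the common odds ratio")

    alpha = num / den  # common odds ratio; > 1 favors the reference group
    # Chi-square statistic (1 df) with continuity correction.
    chi2 = max(0.0, abs(observed - expected) - 0.5) ** 2 / variance
    # ETS delta scale: negative values mean the item is harder for the focal group.
    return {"alpha": alpha, "delta": -2.35 * math.log(alpha), "chi2": chi2}
```

A `chi2` value above 3.84 (the 5% critical value of the chi-square distribution with one degree of freedom) would flag the item. Because this statistic matches on a single total score, it does not extend naturally to multidimensional scales, which is the limitation the abstract says the proposed method addresses.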


Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/22ab825e3099/10.1177_01466216241238744-fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/367a7fdfa800/10.1177_01466216241238744-fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/d44585987716/10.1177_01466216241238744-fig3.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/5bae6f63706f/10.1177_01466216241238744-fig4.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/3b1bc655d05a/10.1177_01466216241238744-fig5.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/79ba48994bf3/10.1177_01466216241238744-fig6.jpg
Figure 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/882924b42184/10.1177_01466216241238744-fig7.jpg
Figure 8: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/c19cd83efc63/10.1177_01466216241238744-fig8.jpg
Figure 9: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d4de/11268249/4d8066fbd4f9/10.1177_01466216241238744-fig9.jpg

