Using Corpus Analyses to Help Address the DIF Interpretation: Gender Differences in Standardized Writing Assessment.

Authors

Li Zhi, Chen Michelle Y, Banerjee Jayanti

Affiliations

Department of Linguistics, College of Arts and Science, University of Saskatchewan, Saskatoon, SK, Canada.

Paragon Testing Enterprises, Vancouver, BC, Canada.

Publication details

Front Psychol. 2020 Jun 3;11:1088. doi: 10.3389/fpsyg.2020.01088. eCollection 2020.

DOI: 10.3389/fpsyg.2020.01088
PMID: 32581944
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7283922/

Abstract

Addressing differential item functioning (DIF) provides validity evidence to support the interpretation of test scores across groups. Conventional DIF methods flag DIF items statistically, but often fail to consolidate a substantive interpretation. The lack of interpretability of DIF results is particularly pronounced in writing assessment, where the matching of test takers' proficiency levels often relies on external variables and the reported DIF effect is frequently small in magnitude. Using responses to a prompt that showed small gender DIF favoring female test takers, we demonstrate a corpus-based approach that helps address DIF interpretation. To provide linguistic insights into the possible sources of the small DIF effect, this study compared female and male test takers' writing in a gender-balanced corpus of 826 samples matched by test takers' performance on the reading and listening components of the test. Four groups of linguistic features that correspond to the rating dimensions, and thus partially represent the writing construct, were analyzed: (1) sentiment and social cognition, (2) cohesion, (3) syntactic features, and (4) lexical features. After initial screening, 123 linguistic features, all of which were correlated with the writing scores, were retained for gender comparison. Among these selected features, female test takers' writing samples scored higher on six, with small effect sizes, in the categories of cohesion and syntactic features. Three of the six features were positively correlated with writing scores, while the other three were negatively correlated. These results are largely consistent with previous findings of gender differences in language use. Additionally, the small differences in the language features of the writing samples (in terms of the small number of features that differ between genders and the small effect size of the observed differences) are consistent with the previous DIF results, both suggesting that the effect of gender differences on the writing scores is likely to be very small. In sum, the corpus-based findings provide linguistic insights into the gender-related language differences and their potential consequences in a testing context. These findings are meaningful for furthering our understanding of the small gender DIF effect identified through statistical analysis, which lends support to the validity of writing scores.
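The two-step workflow described in the abstract (screen features by their relationship with the writing score, then compare the retained features across gender groups) can be illustrated with a minimal sketch. The abstract does not say which statistics were used, so everything specific here is an assumption made for illustration: Pearson correlation for screening, Cohen's d with a pooled standard deviation for the group comparison, the `min_abs_r` threshold, the data layout, and all function and feature names.

```python
import math
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def cohens_d(a, b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var) if pooled_var else 0.0


def screen_and_compare(samples, feature_names, min_abs_r=0.1):
    """Keep features correlated with the score, then compare groups.

    `samples` is a list of dicts with keys "gender" ("F" or "M"), "score",
    and one numeric entry per feature name. `min_abs_r` is an illustrative
    threshold, not a value taken from the paper.
    """
    scores = [s["score"] for s in samples]
    results = {}
    for name in feature_names:
        values = [s[name] for s in samples]
        r = pearson_r(values, scores)
        if abs(r) < min_abs_r:
            continue  # screened out: no usable relationship with the score
        female = [s[name] for s in samples if s["gender"] == "F"]
        male = [s[name] for s in samples if s["gender"] == "M"]
        results[name] = {
            "r_with_score": r,
            "d_female_minus_male": cohens_d(female, male),
        }
    return results


if __name__ == "__main__":
    # Toy data with hypothetical feature names; not from the study.
    toy = [
        {"gender": g, "score": s, "connectives": s + bonus, "noise": n}
        for g, bonus in (("F", 1), ("M", 0))
        for s, n in zip((1, 2, 3, 4), (1, -1, -1, 1))
    ]
    for feature, stats in screen_and_compare(toy, ["connectives", "noise"]).items():
        print(feature, stats)
```

Reporting the sign of each feature's correlation with the score next to the group difference mirrors how the abstract reads its six features: a feature on which one group scores higher may be associated with either higher or lower writing scores.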
