Test bias in a cognitive test: differential item functioning in the CASI.

Author information

Crane Paul K, van Belle Gerald, Larson Eric B

Affiliations

Medicine and Public Health and Community Medicine, University of Washington, Seattle 98104, USA.

Publication information

Stat Med. 2004 Jan 30;23(2):241-56. doi: 10.1002/sim.1713.

DOI:10.1002/sim.1713

PMID:14716726

Abstract

Assessment of test bias is important to establish the construct validity of tests. Assessment of differential item functioning (DIF) is an important first step in this process. DIF is present when examinees from different groups have differing probabilities of success on an item, after controlling for overall ability level. Here, we present analysis of DIF in the Cognitive Assessment Screening Instrument (CASI) using data from a large cohort study of elderly adults. We developed an ordinal logistic regression modelling technique to assess test items for DIF. Estimates of cognitive ability were obtained in two ways based on responses to CASI items: using traditional CASI scoring according to the original test instructions as well as using item response theory (IRT) scoring. Several demographic characteristics were examined for potential DIF, including ethnicity and gender (entered into the model as dichotomous variables), and years of education and age (entered as continuous variables). We found that a disappointingly large number of items had DIF with respect to at least one of these demographic variables. More items were found to have DIF with traditional CASI scoring than with IRT scoring. This study demonstrates a powerful technique for the evaluation of DIF in psychometric tests. The finding that so many CASI items had DIF suggests that previous findings of differences between groups in cognitive functioning as measured by the CASI may be due to biased test items rather than true differences between groups. The finding that IRT scoring diminished the impact of DIF is discussed. Some preliminary suggestions for how to deal with items found to have DIF in cognitive tests are made. The advantages of the DIF detection techniques we developed are discussed in relation to other techniques for the evaluation of DIF.
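The definition of DIF given in the abstract can be illustrated with a minimal sketch of a proportional-odds (ordinal logistic) item model. This is a generic textbook formulation and not the authors' code or their exact model specification; the names `theta` (ability estimate), `group` (a 0/1 demographic indicator), `cutpoints`, and the coefficients `b_ability`, `b_group`, and `b_interaction` are assumptions introduced here, and all numeric values are invented for illustration. When `b_group` and `b_interaction` are both zero, two examinees with the same ability have the same response probabilities regardless of group, so the item shows no DIF. A nonzero `b_group` shifts one group at every ability level, and a nonzero `b_interaction` makes the shift depend on ability.

```python
import math


def cumulative_probs(theta, group, cutpoints, b_ability, b_group=0.0, b_interaction=0.0):
    """Return P(score >= k) for k = 1..K under a proportional-odds model.

    cutpoints must be increasing so the cumulative probabilities decrease in k.
    """
    eta = b_ability * theta + b_group * group + b_interaction * theta * group
    return [1.0 / (1.0 + math.exp(-(eta - c))) for c in cutpoints]


def category_probs(theta, group, cutpoints, b_ability, b_group=0.0, b_interaction=0.0):
    """Return the probability of each ordinal item score 0..K."""
    upper = cumulative_probs(theta, group, cutpoints, b_ability, b_group, b_interaction)
    bounds = [1.0] + upper + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


def expected_score(theta, group, cutpoints, b_ability, b_group=0.0, b_interaction=0.0):
    """Return the expected item score for one examinee."""
    probs = category_probs(theta, group, cutpoints, b_ability, b_group, b_interaction)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical three-category item (scores 0, 1, 2); values are illustrative only.
CUTPOINTS = [-1.0, 1.0]
B_ABILITY = 1.2

for theta in (-2.0, 0.0, 2.0):
    reference = expected_score(theta, 0, CUTPOINTS, B_ABILITY)
    # Group term only: the focal group is shifted by the same amount on the logit scale.
    shifted = expected_score(theta, 1, CUTPOINTS, B_ABILITY, b_group=-0.8)
    # Interaction term only: the size and direction of the gap depend on ability.
    crossing = expected_score(theta, 1, CUTPOINTS, B_ABILITY, b_interaction=-0.5)
    print(f"theta={theta:+.1f}  reference={reference:.3f}  "
          f"group-shift={shifted:.3f}  interaction={crossing:.3f}")
```

In an actual analysis the coefficients would be estimated from item responses, and an item would be flagged when the group or interaction terms improve model fit after conditioning on ability. The sketch only shows what those terms mean for two examinees matched on ability.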

Similar articles

A 37-item shoulder functional status item pool had negligible differential item functioning.

J Clin Epidemiol. 2006 May;59(5):478-84. doi: 10.1016/j.jclinepi.2005.10.007. Epub 2006 Mar 14.

Differential item functioning on the Mini-Mental State Examination. An application of the Mantel-Haenszel and standardization procedures.

Med Care. 2006 Nov;44(11 Suppl 3):S107-14. doi: 10.1097/01.mlr.0000245182.36914.4a.

Identification of differential item functioning using item response theory and the likelihood-based model comparison approach. Application to the Mini-Mental State Examination.

Med Care. 2006 Nov;44(11 Suppl 3):S134-42. doi: 10.1097/01.mlr.0000245251.83359.8c.

Differential item functioning (DIF) and the Mini-Mental State Examination (MMSE). Overview, sample, and issues of translation.

Med Care. 2006 Nov;44(11 Suppl 3):S95-S106. doi: 10.1097/01.mlr.0000245181.96133.db.

Differential item functioning analysis with ordinal logistic regression techniques. DIFdetect and difwithpar.

Med Care. 2006 Nov;44(11 Suppl 3):S115-23. doi: 10.1097/01.mlr.0000245183.28384.ed.

Validating a multiple mini-interview question bank assessing entry-level reasoning skills in candidates for graduate-entry medicine and dentistry programmes.

Med Educ. 2009 Apr;43(4):350-9. doi: 10.1111/j.1365-2923.2009.03292.x.

Demographic variation in SF-12 scores: true differences or differential item functioning?

Med Care. 2003 Jul;41(7 Suppl):III75-III86. doi: 10.1097/01.MLR.0000076052.42628.CF.

Investigating differential item functioning by chronic diseases in the SF-36 health survey: a latent trait analysis using MIMIC models.

Med Care. 2007 Sep;45(9):851-9. doi: 10.1097/MLR.0b013e318074ce4c.

Assessment of differential item functioning.

J Appl Meas. 2008;9(4):387-408.

Cited by

Psychometric properties of the Flourishing Scale for South African first-year students.

Afr J Psychol Assess. 2023 Mar 24;5:130. doi: 10.4102/ajopa.v5i0.130. eCollection 2023.

Invariance and item bias of the Mental Health Continuum Short-Form for South African university first-year students.

Afr J Psychol Assess. 2024 Mar 26;6:143. doi: 10.4102/ajopa.v6i0.143. eCollection 2024.

Further Validation for a Measure of Disordered Eating in an Independent Sample of Male and Female Elite Athletes: The Athletic Disordered Eating (ADE) Scale.

Int J Eat Disord. 2025 Feb;58(2):400-410. doi: 10.1002/eat.24344. Epub 2024 Dec 4.

Detecting Differential Item Functioning Using Response Time.

Educ Psychol Meas. 2024 Oct 26:00131644241280400. doi: 10.1177/00131644241280400.

Evaluation of psychometric properties of the Dental Anxiety Inventory (DAI-36) questionnaire using iterative hybrid ordinal logistic: Differential item functioning (DIF).

Brain Behav. 2023 Sep;13(9):e3129. doi: 10.1002/brb3.3129. Epub 2023 Jul 17.

Cultural Differences in How People Deal with Ridicule and Laughter: Differential Item Functioning between the Taiwanese Chinese and Canadian English Versions of the PhoPhiKat-45.

Eur J Investig Health Psychol Educ. 2023 Jan 20;13(2):238-258. doi: 10.3390/ejihpe13020019.

Comparison of unweighted and item response theory-based weighted sum scoring for the Nine-Questions Depression-Rating Scale in the Northern Thai Dialect.

BMC Med Res Methodol. 2022 Oct 12;22(1):268. doi: 10.1186/s12874-022-01744-0.

Effects of Anonymity versus Examinee Name on a Measure of Depressive Symptoms in Adolescents.

Children (Basel). 2022 Jun 29;9(7):972. doi: 10.3390/children9070972.

Analysis of Race and Sex Bias in the Autism Diagnostic Observation Schedule (ADOS-2).

JAMA Netw Open. 2022 Apr 1;5(4):e229498. doi: 10.1001/jamanetworkopen.2022.9498.

Differential item functioning analysis on the Geriatric Depression Scale-15: An iterative hybrid ordinal logistic regression.

Biomedicine (Taipei). 2021 Dec 1;11(4):23-34. doi: 10.37796/2211-8039.1098. eCollection 2021.
