Evaluating measurement equivalence using the item response theory log-likelihood ratio (IRTLR) method to assess differential item functioning (DIF): applications (with illustrations) to measures of physical functioning ability and general distress.

Author information

Teresi Jeanne A, Ocepek-Welikson Katja, Kleinman Marjorie, Cook Karon F, Crane Paul K, Gibbons Laura E, Morales Leo S, Orlando-Edelen Maria, Cella David

Affiliation

Faculty of Medicine, New York State Psychiatric Institute, Columbia University Stroud Center, New York, NY, USA.

Publication information

Qual Life Res. 2007;16 Suppl 1:43-68. doi: 10.1007/s11136-007-9186-4. Epub 2007 May 5.

Abstract

BACKGROUND

Methods based on item response theory (IRT) that can be used to examine differential item functioning (DIF) are illustrated. An IRT-based approach to the detection of DIF was applied to physical function and general distress item sets. DIF was examined with respect to gender, age and race. The method used for DIF detection was the item response theory log-likelihood ratio (IRTLR) approach. DIF magnitude was measured using the differences in the expected item scores, expressed as the unsigned probability differences, and calculated using the non-compensatory DIF index (NCDIF). Finally, impact was assessed using expected scale scores, expressed as group differences in the total test (measure) response functions.
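The three quantities named above (the likelihood-ratio statistic, NCDIF, and the expected scale score) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure. It assumes dichotomous items under a two-parameter logistic (2PL) model, whereas the study's items may be polytomous. It assumes the log-likelihoods of the constrained and free models have already been obtained from some IRT fitting software. It uses the common definition of NCDIF as the mean squared difference in expected item scores over the focal group's ability estimates. All function names and parameter values are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Expected score of a dichotomous item under a 2PL model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def lr_test(loglik_constrained, loglik_free, df):
    """Likelihood-ratio statistic G^2 = -2 (lnL_constrained - lnL_free).

    The constrained model forces the studied item's parameters to be equal
    across groups; the free model lets them differ. The p-value uses the
    closed-form chi-square survival function, available here for df 1 or 2.
    """
    g2 = max(-2.0 * (loglik_constrained - loglik_free), 0.0)
    if df == 1:
        p = math.erfc(math.sqrt(g2 / 2.0))
    elif df == 2:
        p = math.exp(-g2 / 2.0)
    else:
        raise ValueError("this sketch only handles df = 1 or df = 2")
    return g2, p


def ncdif(focal_thetas, ref_params, focal_params):
    """Mean squared difference in expected item scores over focal-group thetas."""
    a_r, b_r = ref_params
    a_f, b_f = focal_params
    sq = [(p_2pl(t, a_f, b_f) - p_2pl(t, a_r, b_r)) ** 2 for t in focal_thetas]
    return sum(sq) / len(sq)


def expected_scale_score(theta, item_params):
    """Test response function: sum of expected item scores at a given theta."""
    return sum(p_2pl(theta, a, b) for a, b in item_params)


if __name__ == "__main__":
    # Hypothetical log-likelihoods and item parameters, for illustration only.
    g2, p = lr_test(-1001.92, -1000.0, df=1)
    print(f"G2 = {g2:.2f}, p = {p:.3f}")
    print(ncdif([-1.0, 0.0, 1.0], ref_params=(1.2, 0.0), focal_params=(1.2, 0.4)))
    ref_items = [(1.2, 0.0), (0.8, -0.5)]
    focal_items = [(1.2, 0.4), (0.8, -0.5)]
    print(expected_scale_score(0.0, ref_items) - expected_scale_score(0.0, focal_items))
```

The last line of the demo shows impact in the sense used above: the group difference in expected scale scores at a fixed ability level, obtained by summing item-level expected scores.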

METHODS

The data used to illustrate the methods came from a study of 1,714 patients with cancer or HIV/AIDS. The measure contained 23 items measuring physical functioning ability and 15 items addressing general distress, scored in the positive direction.

RESULTS

Substantively, the DIF found was of relatively small magnitude. In total, six physical function items showed relatively larger-magnitude DIF (expected item score differences greater than the cutoff) across the three comparisons: "trouble with a long walk" (race), "vigorous activities" (race, age), "bending, kneeling, stooping" (age), "lifting or carrying groceries" (race), "limited in hobbies, leisure" (age), and "lack of energy" (race). None of the general distress items evidenced high-magnitude DIF, although "worrying about dying" showed some DIF with respect to both age and race after adjustment.

CONCLUSIONS

The fact that many physical function items showed DIF with respect to age, even after adjustment for multiple comparisons, indicates that the instrument may be performing differently for these groups. While the magnitude and impact of DIF at the item and scale level were minimal, caution should be exercised in the use of subsets of these items, as might occur with selection for clinical decisions or computerized adaptive testing. The selection of anchor items and the criteria for DIF detection, including the integration of significance and magnitude measures, remain issues requiring investigation. Further research is needed regarding the criteria and guidelines appropriate for DIF detection in the context of health-related items.

