Department of Health Law, Policy, & Management, Boston University School of Public Health, Boston, MA, USA.
RTI International, Quality Measurement and Health Policy Program, 307 Waverly Oaks Road, Suite 101, Waltham, MA, 02452-8413, USA.
Qual Life Res. 2021 Jun;30(6):1757-1768. doi: 10.1007/s11136-021-02765-w. Epub 2021 Feb 21.
Sociodemographic characteristics may influence responses on self-reported measures. Differential item functioning (DIF) occurs when individuals expected to have the same ability level on a construct of interest have different probabilities of endorsing an item on an item response theory (IRT) scale because of population characteristics. The goal of this study was to identify DIF by sociodemographic factors for items in an outcome instrument and, once controlling for DIF, to assess true differences in function by those same factors.
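As an illustration of the definition above, the absence of DIF for a dichotomous item can be written as an equality of endorsement probabilities at every ability level. The notation here is an assumption introduced for this sketch and does not come from the study: $X_i$ is the response to item $i$, $\theta$ is the latent ability, and $G$ takes the values $R$ (reference group) and $F$ (focal group).

```latex
P(X_i = 1 \mid \theta,\, G = R) \;=\; P(X_i = 1 \mid \theta,\, G = F)
\quad \text{for all } \theta
```

An item shows DIF when this equality fails for some $\theta$, that is, when group membership still changes the endorsement probability after conditioning on ability.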
The Work Disability Functional Assessment Battery 2.0 (WD-FAB 2.0) is an IRT-based self-reported measure of activity limitations relevant to work. Two samples from WD-FAB development were used: 3793 SSA disability claimants randomly drawn from a pool of 16,500 claimants and a general sample of 2100 working-age adults. We used a two-step IRT-based DIF method for three pairs of respondent characteristics (age, gender, and race/ethnicity) and calculated the weighted absolute difference between item characteristic curves. Independent two-group t-tests assessed differences in scores across groups.
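The weighted absolute difference between item characteristic curves mentioned above can be sketched roughly as follows. This is a minimal illustration and not the study's implementation: it assumes a dichotomous two-parameter logistic (2PL) item, a standard normal weighting over ability, and hypothetical item parameters, none of which are stated in the abstract.

```python
import math


def icc_2pl(theta, a, b):
    """Endorsement probability under a 2PL model (assumed model, for illustration)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def weighted_abs_icc_difference(ref, focal, lo=-4.0, hi=4.0, n=161):
    """Weighted mean absolute gap between two item characteristic curves.

    ref, focal: (a, b) parameter pairs estimated separately for each group.
    Weights follow a standard normal density over an ability grid and are
    normalized to sum to 1 (an assumption, not taken from the study).
    """
    step = (hi - lo) / (n - 1)
    thetas = [lo + i * step for i in range(n)]
    weights = [math.exp(-0.5 * t * t) for t in thetas]
    total = sum(weights)
    gap = sum(
        w * abs(icc_2pl(t, *ref) - icc_2pl(t, *focal))
        for t, w in zip(thetas, weights)
    )
    return gap / total


# Hypothetical parameters: identical curves versus a shift in item difficulty.
no_dif = weighted_abs_icc_difference((1.2, 0.0), (1.2, 0.0))
uniform_dif = weighted_abs_icc_difference((1.2, 0.0), (1.2, 0.5))
larger_dif = weighted_abs_icc_difference((1.2, 0.0), (1.2, 1.0))
```

Identical curves give a difference of zero, and the statistic grows as the focal group's difficulty parameter moves away from the reference group's, which is why it can serve as an effect-size measure for flagged items.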
Seventeen items displayed DIF. Men had higher scores than women on two physical and two mental function scales. Older respondents had lower physical and higher mental function scores. The lower education group had lower mental function scores.
DIF impacts function measurement and is important when assessing psychometric characteristics of instruments. Self-report measures should include diverse samples to conduct similar analyses. WD-FAB 2.0 scores are now reflections of function with reduced bias related to gender, race/ethnicity, or age.