Teresi Jeanne A, Ramirez Mildred, Lai Jin-Shei, Silver Stephanie
Columbia University Stroud Center, Faculty of Medicine and New York State Psychiatric Institute.
Psychol Sci Q. 2008;50(4):538.
Examining the equivalence of measures involves several levels, including conceptual equivalence of meaning and quantitative tests of differential item functioning (DIF). The purpose of this review is to examine DIF in patient-reported outcomes. The measures reviewed covered self-reported depression, quality of life (QoL), and general health. Most measures of depression contained large amounts of DIF, and the impact of DIF at the scale level was typically sizeable. The studies of QoL and health measures identified a moderate amount of DIF; however, many of these studies examined only one type of DIF (uniform). Compared with the DIF analyses of depression measures, the analyses of QoL and health measures paid less attention to the impact of DIF, and their authors generally did not recommend remedial action, with one notable exception. While these studies represent good initial efforts to examine measurement equivalence in patient-reported outcome measures, more cross-validation work is required using other (often larger) samples of different ethnic and language groups, as well as other methods that permit more extensive analyses of the type, magnitude, and impact of DIF.
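The distinction between uniform and non-uniform DIF mentioned in the abstract can be illustrated with one common formulation, logistic regression DIF detection for a binary item. This is a sketch under stated assumptions: the abstract does not say which methods the reviewed studies used, and the symbols below (item response Y, trait level theta, group indicator G, coefficients beta) are illustrative rather than notation from the review.

```latex
% Assumed illustrative model: probability of endorsing a binary item Y,
% given trait level \theta (e.g. depression severity) and group G (0 or 1).
\[
  \mathrm{logit}\, P(Y = 1 \mid \theta, G)
    = \beta_0 + \beta_1 \theta + \beta_2 G + \beta_3 (\theta \times G)
\]
% No DIF:          \beta_2 = 0 and \beta_3 = 0
% Uniform DIF:     \beta_2 \neq 0 and \beta_3 = 0
%                  (group difference is constant across trait levels)
% Non-uniform DIF: \beta_3 \neq 0
%                  (group difference varies with trait level)
```

Under this formulation, an analysis that tests only the group term would not detect items whose group difference changes with trait level, which is one way to read the abstract's concern about studies that examined only uniform DIF.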