National Research Centre for the Working Environment, Copenhagen, Denmark.
Scand J Public Health. 2010 Feb;38(3 Suppl):90-105. doi: 10.1177/1403494809352533.
To evaluate the construct validity of the Copenhagen Psychosocial Questionnaire II (COPSOQ II) by means of tests for differential item functioning (DIF) and differential item effect (DIE).
We used a Danish general population postal survey (n = 4,732, of whom 3,517 were wage earners) with a one-year register-based follow-up for long-term sickness absence. DIF was evaluated against age, gender, education, social class, public/private sector employment, and job type using ordinal logistic regression. DIE was evaluated against job satisfaction and self-rated health (using ordinal logistic regression), against depressive symptoms, burnout, and stress (using multiple linear regression), and against long-term sick leave (using a proportional hazards model). We used a cross-validation approach to reduce the risk of spuriously significant results arising from multiple testing.
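The cross-validation idea can be illustrated with a minimal sketch. It assumes that each test is run separately in two independent samples and that a finding is kept only if it is statistically significant in both, points in the same direction, and exceeds a practical-significance threshold. The names `TestResult`, `replicated`, `alpha`, and `min_effect`, and the default values, are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class TestResult:
    """Result of one DIF/DIE test in one sample (hypothetical structure)."""
    estimate: float  # effect estimate, e.g. a regression coefficient
    p_value: float


def replicated(a: TestResult, b: TestResult,
               alpha: float = 0.05, min_effect: float = 0.0) -> bool:
    """Return True if a finding holds up in both independent samples.

    alpha and min_effect are illustrative assumptions, not the study's values.
    """
    both_significant = a.p_value < alpha and b.p_value < alpha
    same_direction = a.estimate * b.estimate > 0
    large_enough = min(abs(a.estimate), abs(b.estimate)) >= min_effect
    return both_significant and same_direction and large_enough


if __name__ == "__main__":
    sample_1 = TestResult(estimate=0.8, p_value=0.01)
    sample_2 = TestResult(estimate=0.6, p_value=0.03)
    print(replicated(sample_1, sample_2, min_effect=0.5))
```

Requiring agreement across two samples trades some statistical power for protection against chance findings, which matters when many tests are run.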
Out of 1,052 tests, we found 599 significant instances of DIF/DIE, 69 of which showed both practical and statistical significance across two independent samples. Most DIF occurred for job type (in 20 cases), while we found little DIF for age, gender, education, social class and sector. DIE seemed to pertain to particular items, which showed DIE in the same direction for several outcome variables.
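To put these counts in perspective, the following sketch computes the share of tests that were significant and the share that met both criteria in both samples, alongside the number of chance findings one would expect under a purely hypothetical scenario. The significance level `alpha = 0.05`, and the assumption that all null hypotheses are true and all tests independent, are illustrative only and are not stated in the abstract.

```python
n_tests = 1052        # total number of DIF/DIE tests reported
n_significant = 599   # significant instances reported
n_replicated = 69     # practically and statistically significant in both samples

alpha = 0.05  # assumed significance level (not stated in the abstract)

share_significant = n_significant / n_tests
share_replicated = n_replicated / n_tests

# Expected chance findings if every null hypothesis were true
# and all tests were independent (a simplifying assumption).
expected_one_sample = alpha * n_tests
expected_two_samples = alpha ** 2 * n_tests

print(f"significant: {share_significant:.1%}")
print(f"replicated:  {share_replicated:.1%}")
print(f"expected by chance, one sample:  {expected_one_sample:.1f}")
print(f"expected by chance, two samples: {expected_two_samples:.2f}")
```

Under these assumptions, requiring significance in two independent samples reduces the expected number of purely chance findings from roughly 53 to fewer than 3.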
The results allowed a preliminary identification of items that have a positive impact on construct validity and items that have a negative impact on construct validity. These results can be used to develop better short-form measures and to improve the conceptual framework, items, and scales of the COPSOQ II.
We conclude that tests of DIF and DIE are useful for evaluating construct validity.