Dept. of Methods in Empirical Social Research, Technische Universität Dresden, Dresden, Germany.
Charité - Universitätsmedizin Berlin, Institute of International Health, Berlin, Germany.
BMC Med Res Methodol. 2023 Sep 28;23(1):213. doi: 10.1186/s12874-023-02015-2.
Configural, metric, and scalar measurement invariance are commonly treated as indicators of bias-free statistical cross-group comparisons, although they are difficult to verify in the data. Low comparability of translated questionnaires, or respondents' differing understanding of response formats, may lead to rejection of measurement invariance and point to comparability bias in multi-language surveys. Anchoring vignettes have been proposed as a method to control for respondents' differing understanding of response categories (referred to as differential item functioning related to response categories or rating scales: RC-DIF). We evaluate whether the cross-cultural comparability of data can be assured by means of anchoring vignettes or, as an alternative approach, by considering socio-demographic heterogeneity.
We used the Health System Responsiveness (HSR) questionnaire and collected survey data in English (n = 183) and Arabic (n = 121) from a random sample of refugees in the third largest German federal state. We conducted multiple-group confirmatory factor analyses (MGCFA) to analyse measurement invariance and compared the results when 1) using data rescaled on the basis of anchoring vignettes (non-parametric approach), 2) including information on RC-DIF from the analyses with anchoring vignettes as covariates (parametric approach), and 3) including socio-demographic covariates.
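The non-parametric rescaling in approach 1 can be illustrated with a minimal sketch. It assumes the recoding commonly associated with anchoring vignettes, in which a respondent's self-rating is located relative to that same respondent's vignette ratings. The abstract does not specify the exact procedure, so the function name `c_scale`, the treatment of tied or misordered vignette ratings, and the example ratings are illustrative assumptions rather than the authors' implementation.

```python
from typing import Optional, Sequence


def c_scale(self_rating: int, vignette_ratings: Sequence[int]) -> Optional[int]:
    """Recode a self-rating relative to the respondent's own vignette ratings.

    vignette_ratings must be listed from the vignette intended as lowest to
    the one intended as highest. With J vignettes the result lies in
    1..2J+1: odd values fall between vignettes, even values tie with one.
    Returns None when the respondent's vignette ratings are tied or
    misordered (an assumption of this sketch; such cases need extra care).
    """
    # Require strictly increasing vignette ratings for this respondent.
    for lower, higher in zip(vignette_ratings, vignette_ratings[1:]):
        if lower >= higher:
            return None

    position = 1
    for rating in vignette_ratings:
        if self_rating < rating:
            return position          # below this vignette
        if self_rating == rating:
            return position + 1      # tied with this vignette
        position += 2                # above this vignette, move on
    return position                  # above all vignettes


# Hypothetical respondent who rated two vignettes as 2 and 4 on a 1-5 scale.
rescaled = [c_scale(y, [2, 4]) for y in range(1, 6)]
```

Because each self-rating is expressed relative to the respondent's own use of the response scale, two respondents who use the categories differently can still receive the same rescaled value. Respondents for whom the function returns `None` would have to be handled separately before any invariance analysis.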
For the HSR, every level of measurement invariance between the Arabic and English versions was rejected. Rescaling or modelling on the basis of anchoring vignettes yielded better results than the initial MGCFA, since configural, metric and (for ordered categorical analyses) scalar invariance could not be rejected. Considering socio-demographic variables did not produce a comparable improvement.
Survey researchers may consider anchoring vignettes as a method to assess the cross-cultural comparability of data, whereas socio-demographic variables cannot be used as a standalone method to improve data comparability. Further research is needed on the efficient implementation of anchoring vignettes and on methods for incorporating them when modelling measurement invariance.