Department of Community Health Sciences, University of Manitoba, S113-750 Bannatyne Avenue, Winnipeg, MB, R3E 0W3, Canada.
Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
Qual Life Res. 2022 Sep;31(9):2837-2848. doi: 10.1007/s11136-022-03129-8. Epub 2022 Apr 7.
Item non-response (i.e., missing data) may mask the detection of differential item functioning (DIF) in patient-reported outcome measures or result in biased DIF estimates. Non-response can be challenging to address in ordinal data. We investigated an unsupervised machine-learning method for ordinal item-level imputation and compared it with commonly used item non-response methods when testing for DIF.
Computer simulation and real-world data were used to assess several item non-response methods using the item response theory likelihood ratio test for DIF. The methods included: (a) list-wise deletion (LD), (b) half-mean imputation (HMI), (c) full information maximum likelihood (FIML), and (d) non-negative matrix factorization (NNMF), which adopts a machine-learning approach to impute missing values. Control of Type I error rates was evaluated using a liberal robustness criterion for α = 0.05 (i.e., 0.025-0.075). Statistical power was assessed with and without adoption of an item non-response method; differences > 10% were considered substantial.
Type I error rates for detecting DIF using the LD, FIML, and NNMF methods were controlled within the bounds of the robustness criterion for > 95% of simulation conditions, although NNMF occasionally produced inflated rates. The HMI method always resulted in inflated error rates with 50% missing data. Differences in power to detect moderate DIF effects for the LD, FIML, and NNMF methods were substantial with 50% missing data and insubstantial otherwise.
The NNMF method demonstrated performance comparable to commonly used non-response methods. This computationally efficient method represents a promising approach to address item-level non-response when testing for DIF.
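The paper does not include an implementation, but the NNMF imputation idea can be sketched as follows: factorize the observed (non-missing) entries of the item-response matrix into two low-rank non-negative factors, then fill in the missing cells from the reconstruction, rounding to the nearest observed ordinal category. The sketch below uses masked multiplicative updates; the function name, rank, and iteration count are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def nnmf_impute(X, mask, rank=2, n_iter=500, seed=0):
    """Impute missing entries of an ordinal item matrix via masked NNMF (sketch).

    X    : (n_persons, n_items) non-negative array; values under mask == 0 are ignored
    mask : (n_persons, n_items) array, 1 where observed, 0 where missing
    rank : assumed latent rank of the factorization (illustrative choice)
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + 1e-4   # person factors
    H = rng.random((rank, m)) + 1e-4   # item factors
    Xo = X * mask                      # zero out missing cells
    eps = 1e-9
    for _ in range(n_iter):
        # multiplicative updates restricted to observed entries
        WH = W @ H
        W *= (Xo @ H.T) / (((mask * WH) @ H.T) + eps)
        WH = W @ H
        H *= (W.T @ Xo) / ((W.T @ (mask * WH)) + eps)
    X_hat = W @ H
    lo, hi = X[mask == 1].min(), X[mask == 1].max()
    # keep observed responses; round reconstructions to the observed category range
    return np.where(mask == 1, X, np.clip(np.rint(X_hat), lo, hi))
```

Observed responses pass through unchanged, and imputed values are clipped to the observed category range so the filled matrix remains valid ordinal data for downstream DIF testing.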