Lee Minji K, Beebe Timothy J, Yost Kathleen J, Eton David T, Novotny Paul J, Dueck Amylou C, Frost Marlene, Sloan Jeff A
Department of Quantitative Health Sciences, Mayo Clinic, 200 First St SW, Rochester, MN, 55905, USA.
Division of Health Policy and Management, University of Minnesota School of Public Health, 625 Michigan Ave, 27th Floor, Chicago, IL, 60611, USA.
J Patient Rep Outcomes. 2021 Sep 17;5(1):95. doi: 10.1186/s41687-021-00368-0.
This study tested the effects of data collection mode on patient responses to multi-item measures, such as the Patient-Reported Outcomes Measurement Information System (PROMIS), and to single-item measures, such as the Patient-Reported Outcomes version of the Common Terminology Criteria for Adverse Events (PRO-CTCAE) and Numerical Rating Scale (NRS) measures.
Adult cancer patients were recruited from five cancer centers and completed measures of anxiety, depression, fatigue, sleep disturbance, pain intensity, pain interference, ability to participate in social roles and activities, global mental and physical health, and physical function. Patients were randomized to complete the measures on paper (n = 595), through an interactive voice response (IVR) system (n = 596), or on a tablet computer (n = 589). We evaluated differential item functioning (DIF) by mode of data collection using the R package lordif. For constructs that showed no DIF, we concluded equivalence across modes if the 95% confidence interval (CI) for the difference in mean scores lay entirely within the equivalence margin, defined as ± 0.20 × pooled SD. If the 95% CI lay entirely outside the equivalence margin, we concluded a systematic score difference between modes. If the 95% CI partly overlapped the equivalence margin, we concluded neither equivalence nor difference.
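The three-way decision rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis code: the abstract does not say how the pooled SD or the 95% CI were computed, so the sketch assumes a two-group pooled SD and a normal-approximation CI (z = 1.96). The function names and the example means and SDs are hypothetical; only the ± 0.20 × pooled SD margin and the group sizes come from the text.

```python
import math


def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Two-group pooled standard deviation (assumed formula, not from the source)."""
    return math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2))


def classify_mode_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b,
                             margin_factor=0.20, z=1.96):
    """Apply the equivalence rule to the difference in mean scores of two modes.

    Returns "equivalent" if the CI lies entirely inside the margin,
    "different" if it lies entirely outside, and "inconclusive" otherwise.
    """
    sd_p = pooled_sd(sd_a, n_a, sd_b, n_b)
    margin = margin_factor * sd_p            # equivalence margin: +/- 0.20 x pooled SD
    diff = mean_a - mean_b
    se = sd_p * math.sqrt(1 / n_a + 1 / n_b)  # assumed normal-approximation standard error
    lower, upper = diff - z * se, diff + z * se

    if -margin <= lower and upper <= margin:
        return "equivalent"
    if lower > margin or upper < -margin:
        return "different"
    return "inconclusive"


# Illustrative values only (not study data), using the reported group sizes.
print(classify_mode_difference(50.0, 10.0, 595, 50.2, 10.0, 589))
print(classify_mode_difference(50.0, 10.0, 595, 51.5, 10.0, 596))
```

Keeping "inconclusive" as a separate outcome mirrors the text: a CI that straddles the margin supports neither equivalence nor a systematic difference.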
For all constructs, no DIF of any kind was found across the three modes. Scores on paper and tablet were more comparable with each other than IVR scores were with either of the other modes, but none of the 95% CIs fell completely outside the equivalence margins, so no systematic score difference was established; where a CI partly overlapped the margin, we concluded neither equivalence nor difference. Percentages of missing values were comparable for the paper and tablet modes. Percentages of missing values were higher for IVR (2.3% to 6.5% depending on the measure) than for paper and tablet (0.7% to 3.3% depending on the measure and mode), which was attributed to random technical difficulties experienced at some centers.
Across all mode comparisons, some measures had CIs that were not completely contained within the margin of small effect. The two visual modes (paper and tablet) agreed more closely than the visual-auditory pairs. Compared with paper and tablet, IVR may induce score differences unrelated to the constructs being measured. Survey users should consider using IVR only when paper and computer administration are not feasible.