Department of Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, IL, USA.
Department of Epidemiology and Data Science, Amsterdam UMC, Vrije Universiteit, Amsterdam, The Netherlands.
Value Health. 2023 Oct;26(10):1518-1524. doi: 10.1016/j.jval.2023.06.002. Epub 2023 Jun 12.
This study examined how well classical test theory (CTT) and item response theory (IRT) scores from Patient-Reported Outcomes Measurement Information System® (PROMIS®) measures identify significant individual changes in clinical studies, using both simulated and empirical data.
We used simulated data to compare the estimation of significant individual changes between CTT and IRT scores across different conditions and a clinical trial data set to verify the simulation results. We calculated reliable change indexes to estimate significant individual changes.
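As an illustration of the reliable change index (RCI) mentioned above, the following is a minimal sketch, not the study's own implementation. It assumes the commonly used Jacobson-Truax form for CTT scores (standard error of measurement derived from a baseline SD and a reliability coefficient) and, for IRT scores, a difference divided by the pooled person-level standard errors. All function names, parameter names, and the 1.96 cutoff are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def rci_ctt(score_pre, score_post, sd_baseline, reliability):
    """RCI for CTT scores (assumed Jacobson-Truax form).

    The standard error of measurement (SEM) is the same for every person:
    SEM = SD * sqrt(1 - reliability).
    """
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    se_diff = math.sqrt(2.0) * sem  # SE of the difference of two scores
    return (score_post - score_pre) / se_diff


def rci_irt(theta_pre, theta_post, se_pre, se_post):
    """RCI for IRT scores (assumed form).

    Each IRT score carries its own standard error, so the SE of the
    difference is pooled from the two occasion-specific SEs.
    """
    se_diff = math.sqrt(se_pre ** 2 + se_post ** 2)
    return (theta_post - theta_pre) / se_diff


def classify_change(rci, critical=1.96):
    """Assign a change group; 1.96 is an assumed two-sided 95% cutoff."""
    if rci >= critical:
        return "increased"
    if rci <= -critical:
        return "decreased"
    return "no reliable change"


if __name__ == "__main__":
    # Hypothetical values on a T-score-like metric, for illustration only.
    print(classify_change(rci_ctt(50, 60, sd_baseline=10, reliability=0.91)))
    print(classify_change(rci_irt(50, 55, se_pre=3, se_post=4)))
```

Whether an increase counts as improvement or worsening depends on the direction of the measure, so the labels here describe only the direction of the score change.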
For small true change, IRT scores classified change groups at a slightly higher rate than CTT scores and were comparable with CTT scores at shorter test lengths. For medium to high true change, IRT scores showed a prominent advantage over CTT scores in change-group classification rates, and this advantage became prominent with longer test lengths. The empirical data analysis, which used an anchor-based approach, further supported these findings: IRT scores classified participants into change groups more accurately than CTT scores.
Given that IRT scores perform better, or at least comparably, in most conditions, we recommend using IRT scores to estimate significant individual changes and identify responders to treatment. This study provides evidence-based guidance for detecting individual changes with CTT and IRT scores under various measurement conditions and supports recommendations for identifying treatment responders among clinical trial participants.