Methodological issues regarding power of classical test theory (CTT) and item response theory (IRT)-based approaches for the comparison of patient-reported outcomes in two groups of patients--a simulation study.

Affiliation

EA 4275 Biostatistique, recherche clinique et mesures subjectives en santé, Faculté de Pharmacie, Université de Nantes, 1 rue Gaston Veil, 44035 Nantes Cedex 1, France.

Publication information

BMC Med Res Methodol. 2010 Mar 25;10:24. doi: 10.1186/1471-2288-10-24.

DOI:10.1186/1471-2288-10-24

PMID:20338031

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2858729/

Abstract

BACKGROUND

Patient-reported outcomes (PROs) are increasingly used in clinical and epidemiological research. Two main analytical strategies are used for these data: classical test theory (CTT), based on observed scores, and models from item response theory (IRT). However, whether IRT or CTT is the more appropriate method for analysing PRO data remains unknown. This study compared the statistical properties of CTT and IRT with respect to power and the corresponding effect sizes.

METHODS

Two-group cross-sectional studies were simulated to compare PRO data using IRT- or CTT-based analysis. For IRT, different scenarios were investigated according to whether item or person parameters were assumed to be known (for item parameters, known with a precision ranging from good to poor) or unknown, in which case they had to be estimated. The powers obtained with IRT and CTT were compared, and the parameters with the strongest impact on power were identified.
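A simulation of this kind can be sketched as follows. This is only a minimal illustration, not the authors' code: the abstract does not name the IRT model, so a dichotomous Rasch model is assumed here, and the item difficulties, the unit-variance normal latent trait, the large-sample z-test on sum scores, and the names `simulate_sum_scores` and `ctt_power` are all hypothetical choices made for the example. Only the CTT-style analysis (comparison of observed sum scores) is shown.

```python
import random
from math import exp, sqrt
from statistics import NormalDist, mean, variance


def simulate_sum_scores(n, group_mean, difficulties, rng):
    """Simulate observed sum scores for n patients under an assumed Rasch model."""
    scores = []
    for _ in range(n):
        # Latent trait: normal with the group's mean and unit variance (assumption).
        theta = rng.gauss(group_mean, 1.0)
        # Each item is endorsed with probability 1 / (1 + exp(-(theta - b))).
        score = sum(rng.random() < 1.0 / (1.0 + exp(-(theta - b)))
                    for b in difficulties)
        scores.append(score)
    return scores


def ctt_power(n_per_group, group_effect, difficulties,
              n_replications=500, alpha=0.05, seed=0):
    """Monte Carlo estimate of the power of a two-sided z-test on sum scores."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_replications):
        group_a = simulate_sum_scores(n_per_group, 0.0, difficulties, rng)
        group_b = simulate_sum_scores(n_per_group, group_effect, difficulties, rng)
        # Standard error of the difference in mean sum scores.
        se = sqrt(variance(group_a) / n_per_group + variance(group_b) / n_per_group)
        if se > 0 and abs(mean(group_b) - mean(group_a)) / se > z_crit:
            rejections += 1
    return rejections / n_replications


if __name__ == "__main__":
    items = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical item difficulties
    print(ctt_power(n_per_group=50, group_effect=0.8, difficulties=items))
```

Rerunning the sketch with a longer or shorter `difficulties` list is one simple way to explore how the number of items affects the estimated power.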

RESULTS

When person parameters were assumed to be unknown, whether item parameters were known or not, the power achieved using IRT or CTT was similar and always lower than the power expected from the well-known sample size formula for normally distributed endpoints. The number of items had a substantial impact on power for both methods.
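For reference, the expected power mentioned above can be computed from the usual normal-approximation formula for comparing two group means. The sketch below assumes a two-sided test, equal group sizes, and a standardized effect size (mean difference divided by the common standard deviation); the abstract does not state the exact variant the authors used, and the function names are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def expected_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    The negligible probability of rejecting in the wrong direction is ignored.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * sqrt(n_per_group / 2)
    return _STD_NORMAL.cdf(noncentrality - z_crit)


def sample_size_per_group(effect_size, power=0.80, alpha=0.05):
    """Patients per group: n = 2 * ((z_{1-alpha/2} + z_{power}) / effect_size)^2."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


if __name__ == "__main__":
    n = sample_size_per_group(effect_size=0.5)
    print(n, expected_power(n, effect_size=0.5))
```

Neither function takes the number of items as an input, so a formula of this form cannot by itself reflect the dependence of power on questionnaire length reported here.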

CONCLUSION

In the absence of missing data, IRT and CTT seem to provide comparable power. The classical sample size formula seems adequate for CTT under some conditions but is not appropriate for IRT. For IRT, it seems important to account for the number of items in order to obtain an accurate formula.

Similar articles

Measuring the ICF components of impairment, activity limitation and participation restriction: an item analysis using classical test theory and item response theory.

Health Qual Life Outcomes. 2009 May 7;7:41. doi: 10.1186/1477-7525-7-41.

Does Scoring Method Impact Estimation of Significant Individual Changes Assessed by Patient-Reported Outcome Measures? Comparing Classical Test Theory Versus Item Response Theory.

Value Health. 2023 Oct;26(10):1518-1524. doi: 10.1016/j.jval.2023.06.002. Epub 2023 Jun 12.

Relationships Among Classical Test Theory and Item Response Theory Frameworks via Factor Analytic Models.

Educ Psychol Meas. 2015 Jun;75(3):389-405. doi: 10.1177/0013164414559071. Epub 2014 Nov 20.

State of the psychometric methods: comments on the ISOQOL SIG psychometric papers.

J Patient Rep Outcomes. 2019 Jul 30;3(1):49. doi: 10.1186/s41687-019-0134-1.

What are the appropriate methods for analyzing patient-reported outcomes in randomized trials when data are missing?

Stat Methods Med Res. 2017 Dec;26(6):2897-2908. doi: 10.1177/0962280215615158. Epub 2015 Nov 6.

Estimating power for clinical trials with Patient Reported Outcomes - using Item Response Theory.

J Clin Epidemiol. 2022 Jan;141:141-148. doi: 10.1016/j.jclinepi.2021.10.002. Epub 2021 Oct 11.

Response pattern of depressive symptoms among college students: What lies behind items of the Beck Depression Inventory-II?

J Affect Disord. 2018 Jul;234:124-130. doi: 10.1016/j.jad.2018.02.064. Epub 2018 Mar 3.

Power and Sample Size Calculations in Clinical Trials with Patient-Reported Outcomes under Equal and Unequal Group Sizes Based on Graded Response Model: A Simulation Study.

Value Health. 2016 Jul-Aug;19(5):639-47. doi: 10.1016/j.jval.2016.03.1857. Epub 2016 Jul 29.

Biases and power for groups comparison on subjective health measurements.

PLoS One. 2012;7(10):e44695. doi: 10.1371/journal.pone.0044695. Epub 2012 Oct 24.

Cited by

Tree-based latent variable model for assessing differential item functioning in patient-reported outcome measures: a simulation study.

Qual Life Res. 2025 Jul 18. doi: 10.1007/s11136-025-04018-6.

Validation of the Maslach Burnout Inventory-General Survey 9-item short version: psychometric properties and measurement invariance across age, gender, and continent.

Front Psychol. 2024 Jul 16;15:1439470. doi: 10.3389/fpsyg.2024.1439470. eCollection 2024.

Instruments for Measuring Psychological Dimensions in Human-Robot Interaction: Systematic Review of Psychometric Properties.

J Med Internet Res. 2024 Jun 5;26:e55597. doi: 10.2196/55597.

Adaptation and Validation of a French Version of the Vaccination Attitudes Examination (VAX) Scale.

Vaccines (Basel). 2023 May 19;11(5):1001. doi: 10.3390/vaccines11051001.

Evaluating the impact of calibration of patient-reported outcomes measures on results from randomized clinical trials: a simulation study based on Rasch measurement theory.

BMC Med Res Methodol. 2022 Aug 12;22(1):224. doi: 10.1186/s12874-022-01680-z.

Psychometric evaluation of a short-form version of the Swedish "Attitudes to and Knowledge of Oral Health" questionnaire.

BMC Geriatr. 2022 Jun 22;22(1):513. doi: 10.1186/s12877-022-03215-z.

Evaluations of the sum-score-based and item response theory-based tests of group mean differences under various simulation conditions.

Stat Methods Med Res. 2021 Dec;30(12):2604-2618. doi: 10.1177/09622802211043263. Epub 2021 Oct 7.

Initial Validation of the Mindful Presence Scale: The Issue of the Construal Level of Scale Items.

Front Psychol. 2021 Jul 21;12:626084. doi: 10.3389/fpsyg.2021.626084. eCollection 2021.

An 11-Item Measure of User- and Human-Centered Design for Personal Health Tools (UCD-11): Development and Validation.

J Med Internet Res. 2021 Mar 16;23(3):e15032. doi: 10.2196/15032.

A Brief Assessment of Body Image Perception: Norm Values and Factorial Structure of the Short Version of the FKB-20.

Front Psychol. 2020 Dec 1;11:579783. doi: 10.3389/fpsyg.2020.579783. eCollection 2020.
