EA 4275 Biostatistique, recherche clinique et mesures subjectives en santé, Faculté de Pharmacie, Université de Nantes, 1 rue Gaston Veil, 44035 Nantes Cedex 1, France.
BMC Med Res Methodol. 2010 Mar 25;10:24. doi: 10.1186/1471-2288-10-24.
Patient-reported outcomes (PRO) are increasingly used in clinical and epidemiological research. Two main analytical strategies are used for these data: classical test theory (CTT), based on observed scores, and models from item response theory (IRT). However, whether IRT or CTT is the more appropriate method for analysing PRO data remains unknown. The statistical properties of CTT and IRT, in terms of power and corresponding effect sizes, were compared.
Two-group cross-sectional studies were simulated to compare PRO data using IRT-based or CTT-based analysis. For IRT, different scenarios were investigated according to whether item or person parameters were assumed to be known (for item parameters, known to a certain extent, with precision ranging from good to poor) or unknown and therefore estimated. The powers obtained with IRT and CTT were compared, and the parameters with the strongest impact on power were identified.
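As an illustration only, the CTT side of such a simulation might be sketched as follows. This is not the authors' code: the Rasch response model, the standard normal latent trait, the large-sample z-test on sum scores, and every function and parameter name are assumptions made for the sketch.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_sum_scores(n, group_mean, item_difficulties, rng):
    """Draw n latent traits from N(group_mean, 1); return Rasch sum scores."""
    scores = []
    for _ in range(n):
        theta = rng.gauss(group_mean, 1.0)
        score = 0
        for b in item_difficulties:
            # Rasch model (assumed): P(positive response) = logistic(theta - b)
            p = 1.0 / (1.0 + math.exp(-(theta - b)))
            score += rng.random() < p
        scores.append(score)
    return scores


def ctt_power(n_per_group, group_effect, item_difficulties,
              n_sim=500, alpha=0.05, seed=1):
    """Estimate the power of a two-sided z-test on observed sum scores."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sim):
        g0 = simulate_sum_scores(n_per_group, 0.0, item_difficulties, rng)
        g1 = simulate_sum_scores(n_per_group, group_effect, item_difficulties, rng)
        se = math.sqrt(variance(g0) / n_per_group + variance(g1) / n_per_group)
        if se == 0:
            continue  # degenerate replicate: counted as a non-rejection
        z = (mean(g1) - mean(g0)) / se
        rejections += abs(z) > z_crit
    return rejections / n_sim


if __name__ == "__main__":
    items = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical item difficulties
    print(ctt_power(100, 0.5, items, n_sim=200))
```

Changing the length of `item_difficulties` gives a quick way to explore how the number of items affects the estimated power.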
When person parameters were assumed to be unknown and item parameters either known or unknown, the powers achieved using IRT or CTT were similar and always lower than the power expected from the well-known sample size formula for normally distributed endpoints. The number of items had a substantial impact on power for both methods.
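For reference, the expected power mentioned above can be sketched with the usual normal approximation for comparing two means. The abstract does not spell out the formula, so this assumes the standard two-sided form with equal group sizes and a standardized effect size, and ignores the negligible opposite-tail term. Function names are illustrative.

```python
import math
from statistics import NormalDist


def expected_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    effect_size is the standardized mean difference (assumed known).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(effect_size * math.sqrt(n_per_group / 2) - z_crit)


def required_n_per_group(effect_size, power=0.8, alpha=0.05):
    """Invert the same approximation: n = 2 (z_{1-alpha/2} + z_{power})^2 / d^2."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return math.ceil(2 * z_sum ** 2 / effect_size ** 2)


if __name__ == "__main__":
    n = required_n_per_group(0.5)
    print(n, expected_power(n, 0.5))
```

Comparing `expected_power` with a simulated power for the same group size and effect is one way to see the shortfall the abstract reports.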
Without any missing data, IRT and CTT seem to provide comparable power. The classical sample size formula for CTT seems to be adequate under some conditions but is not appropriate for IRT. In IRT, it seems important to take account of the number of items to obtain an accurate formula.