Investigating item response theory model performance in the context of evaluating clinical outcome assessments in clinical trials.

Authors

Ayasse Nicolai D, Coon Cheryl D

Affiliation

Clinical Outcome Assessment Program, Critical Path Institute, Tucson, AZ, USA.

Publication information

Qual Life Res. 2025 Apr;34(4):1125-1136. doi: 10.1007/s11136-024-03873-z. Epub 2024 Dec 12.

DOI:10.1007/s11136-024-03873-z

PMID:39666253

Abstract

PURPOSE

Item response theory (IRT) models are an increasingly popular method choice for evaluating clinical outcome assessments (COAs) for use in clinical trials. Given common constraints in clinical trial design, such as limits on sample size and assessment lengths, the current study aimed to examine the appropriateness of commonly used polytomous IRT models, specifically the graded response model (GRM) and partial credit model (PCM), in the context of how they are frequently used for psychometric evaluation of COAs in clinical trials.
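
For readers less familiar with the two models, their usual textbook forms are sketched below. The notation is an assumption introduced here for illustration and does not come from the abstract: \theta_i is the latent trait of respondent i, a_j the slope of item j, b_{jk} the GRM category thresholds, \delta_{jv} the PCM step parameters, and m_j the highest response category of item j. The PCM is written with a common slope fixed at 1, which is the equal-slopes assumption the study varies.

```latex
\begin{align}
% Graded response model: cumulative logits with an item-specific slope a_j
P(X_{ij} \ge k \mid \theta_i) &= \frac{\exp\{a_j(\theta_i - b_{jk})\}}{1 + \exp\{a_j(\theta_i - b_{jk})\}},
  \qquad k = 1, \dots, m_j, \\
P(X_{ij} = k \mid \theta_i) &= P(X_{ij} \ge k \mid \theta_i) - P(X_{ij} \ge k + 1 \mid \theta_i),
  \qquad P(X_{ij} \ge 0 \mid \theta_i) = 1,\; P(X_{ij} \ge m_j + 1 \mid \theta_i) = 0, \\
% Partial credit model: adjacent-category logits, slope fixed at 1 for every item
P(X_{ij} = k \mid \theta_i) &= \frac{\exp\{\sum_{v=1}^{k} (\theta_i - \delta_{jv})\}}{\sum_{c=0}^{m_j} \exp\{\sum_{v=1}^{c} (\theta_i - \delta_{jv})\}},
  \qquad k = 0, \dots, m_j,
\end{align}
```

The empty sum (k = 0 or c = 0) is taken to be zero. The GRM estimates one slope per item, while the PCM shares a single slope across items.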

METHODS

Data were simulated under varying sample sizes, measure lengths, response category numbers, and slope strengths, as well as under conditions that violated some model assumptions, namely, unidimensionality and equality of item slopes. Model fit, detection of item local dependence, and detection of item misfit were all examined to identify conditions where one model may be preferable or results may contain a degree of bias.
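
As a rough illustration of what one simulated condition could look like, the sketch below draws polytomous responses from a GRM with NumPy. It is not the authors' simulation code: the function name `simulate_grm`, the sample size, item count, number of categories, and parameter distributions are all hypothetical choices made for this example, and the abstract does not state which software or values were used.

```python
import numpy as np


def simulate_grm(theta, slopes, thresholds, rng):
    """Draw graded responses (0..K-1) for every person-item pair under a GRM.

    theta:      (n_persons,) latent trait values
    slopes:     (n_items,) item slope (discrimination) parameters
    thresholds: (n_items, K-1) category thresholds, increasing within each item
    rng:        numpy.random.Generator used for the random draws
    """
    theta = np.asarray(theta, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # Cumulative probabilities P(X >= k), k = 1..K-1; shape (persons, items, K-1)
    z = slopes[None, :, None] * (theta[:, None, None] - thresholds[None, :, :])
    p_ge = 1.0 / (1.0 + np.exp(-z))
    # One uniform draw per person-item pair. Because the thresholds increase,
    # P(X >= k) decreases in k, so counting the cumulative probabilities that
    # exceed the draw yields a response with exactly those probabilities.
    u = rng.random((theta.size, slopes.size, 1))
    return (u < p_ge).sum(axis=2)


# Hypothetical condition: small sample, short measure, five response categories.
rng = np.random.default_rng(2024)
n_persons, n_items, n_categories = 250, 8, 5
theta = rng.standard_normal(n_persons)
slopes = rng.uniform(1.0, 2.5, size=n_items)  # unequal slopes across items
thresholds = np.sort(
    rng.normal(0.0, 1.0, size=(n_items, n_categories - 1)), axis=1
)
responses = simulate_grm(theta, slopes, thresholds, rng)
```

Setting every entry of `slopes` to the same value would give an equal-slopes condition. A multidimensional condition would need more than one latent trait per respondent, which this sketch does not cover.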

RESULTS

For unidimensional item sets and equal item slopes, the PCM and GRM performed similarly, and GRM performance remained consistent as slope variability increased. For not-unidimensional item sets, the PCM was somewhat more sensitive to this unidimensionality violation. Looking across conditions, the PCM did not demonstrate a clear advantage over the GRM for small sample sizes or shorter measure lengths.

CONCLUSION

Overall, the GRM and the PCM each demonstrated advantages and disadvantages depending on underlying data conditions and the model outcome investigated. We recommend careful consideration of the known, or expected, data characteristics when choosing a model and interpreting its results.
