Bayesian item response theory to estimate power in clinical trials with patient-reported outcomes as endpoints.

Author information

Mei Xiaohang, Cappelleri Joseph C, Hu Jinxiang

Affiliations

Department of Biostatistics and Data Science, University of Kansas Medical Center, Kansas City, KS, USA.

Statistical Research and Data Science Center, Pfizer Inc, New York, NY, USA.

Publication information

Qual Life Res. 2025 Apr;34(4):1113-1124. doi: 10.1007/s11136-024-03874-y. Epub 2025 Jan 8.

Abstract

PURPOSE

Patient-reported outcomes (PROs) are widely used in clinical trials, epidemiological research, quality of life (QOL) studies, routine clinical care, and medical surveillance. The Patient-Reported Outcomes Measurement Information System (PROMIS) is a set of reliable, standardized PRO measures developed with item response theory (IRT), which scores respondents on a latent scale. Power estimation is critical to the design of clinical trials and other research studies. However, in clinical trials with PROs as endpoints, power is often calculated from observed scores rather than latent scores.

METHODS

In this paper, we conducted a series of simulations to compare the power obtained with IRT latent scores (estimated with either Bayesian IRT or frequentist IRT) and with observed scores, focusing on the small sample sizes common in pilot studies and Phase I/II trials. Taking the PROMIS depression measures as an example, we simulated data and estimated power for two-armed clinical trials while manipulating three factors: sample size, effect size, and number of items. We also examined how misspecification of the effect size affected power estimation.
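The observed-score arm of a simulation like this can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the graded response model item parameters are hypothetical (they are not the PROMIS depression calibrations), the treatment is assumed to lower the latent mean by `effect_size` standard deviations, and the test is a Welch-type statistic compared against a normal critical value to stay within the standard library. A t reference distribution would be more appropriate at the small sample sizes the paper targets.

```python
import math
import random
from statistics import NormalDist, mean, variance


def make_items(n_items):
    # Hypothetical item parameters: discrimination a = 2.0 and four ordered
    # thresholds, giving five response categories (0..4) per item.
    return [(2.0, (-1.0, 0.0, 1.0, 2.0)) for _ in range(n_items)]


def sample_response(theta, a, thresholds, rng):
    """Draw one ordinal response from a graded response model."""
    u = rng.random()
    score = 0
    for b in thresholds:
        # P(X >= k) = logistic(a * (theta - b_k)) decreases with k, so the
        # response is the number of thresholds whose probability exceeds u.
        if u < 1.0 / (1.0 + math.exp(-a * (theta - b))):
            score += 1
        else:
            break
    return score


def sum_score(theta, items, rng):
    """Observed (sum) score across all items for one simulated patient."""
    return sum(sample_response(theta, a, bs, rng) for a, bs in items)


def simulate_power(n_per_arm, effect_size, n_items, reps=500, alpha=0.05, seed=1):
    """Monte Carlo power: share of simulated trials that reject H0."""
    rng = random.Random(seed)
    items = make_items(n_items)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # normal approximation
    rejections = 0
    for _ in range(reps):
        control = [sum_score(rng.gauss(0.0, 1.0), items, rng)
                   for _ in range(n_per_arm)]
        treated = [sum_score(rng.gauss(-effect_size, 1.0), items, rng)
                   for _ in range(n_per_arm)]
        se = math.sqrt(variance(control) / n_per_arm
                       + variance(treated) / n_per_arm)
        if se > 0 and abs(mean(control) - mean(treated)) / se > z_crit:
            rejections += 1
    return rejections / reps
```

Calling `simulate_power(20, 0.5, 8)` estimates power for a total sample size of 40, a medium effect, and an 8-item form. Repeating the call over a grid of sample sizes, effect sizes, and item counts mirrors the three manipulated factors. Running it with an assumed effect size that differs from the one used to plan the trial gives a rough view of the misspecification question.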

RESULTS

Our results showed that Bayesian IRT, which incorporates prior information into latent score estimation, yielded the highest power, especially when the sample size was small. The effect of misspecification diminished as sample size increased.
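To make concrete how prior information can enter latent score estimation, the sketch below computes an expected a posteriori (EAP) score on a grid under a normal prior. It is an illustration only and not the estimation procedure used in the paper, which the abstract does not specify. The function names, the graded response model item parameters, and the prior settings are all assumptions.

```python
import math


def category_probs(theta, a, thresholds):
    """Graded response model category probabilities for one item."""
    cum = ([1.0]
           + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
           + [0.0])
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def eap_score(responses, items, prior_mean=0.0, prior_sd=1.0, n_grid=81):
    """Posterior mean of theta, evaluated on an evenly spaced grid."""
    lo = prior_mean - 4 * prior_sd
    hi = prior_mean + 4 * prior_sd
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    weights = []
    for theta in grid:
        # Log of the (unnormalized) normal prior density at this grid point.
        log_w = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
        for x, (a, bs) in zip(responses, items):
            # Add the log-likelihood of each observed response; the floor
            # guards against log(0) from floating-point underflow.
            log_w += math.log(max(category_probs(theta, a, bs)[x], 1e-300))
        weights.append(math.exp(log_w))
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total
```

With hypothetical items such as `items = [(2.0, (-1.0, 0.0, 1.0, 2.0))] * 4`, comparing `eap_score([4, 4, 4, 4], items, prior_sd=1.0)` with the same call at `prior_sd=0.5` shows the tighter prior pulling the estimate toward the prior mean. That shrinkage is what stabilizes scores when there are few respondents or few items, and it is also why a poorly grounded prior can mislead.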

CONCLUSION

For power estimation in two-armed clinical trials with standardized PRO endpoints, if a medium or larger effect size is expected, we recommend Bayesian IRT (BIRT) simulation with well-grounded informative priors and a total sample size of at least 40.
