Close James, Vandercappellen Jo, King Miriam, Hobart Jeremy
Peninsula Schools of Medicine and Dentistry, University of Plymouth, Plymouth, UK.
Novartis Pharma AG, Basel, Switzerland.
Neurol Ther. 2023 Oct;12(5):1649-1668. doi: 10.1007/s40120-023-00501-9. Epub 2023 Jun 23.
Poorly developed patient-reported outcome measures (PROs) risk type-II errors (i.e. false negatives) in clinical trials, so that a trial erroneously fails to achieve its endpoints. Validity is a fundamental requirement of fit-for-purpose PROs, and the main determinant of validity is a PRO's items, i.e. its content validity. Here, we sought to identify fatigue PRO instruments used in multiple sclerosis (MS) studies and to assess the extent to which their development satisfied current content validity standards.
We searched Embase and Medline for MS studies using fatigue-based PROs. Abstracts were screened, PROs identified, and their relevant development papers assessed against seven COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) criteria for content development.
From 3814 abstracts, 18 fatigue PROs met our inclusion criteria. Most PROs did not satisfy at least one COSMIN content validity standard. Frequent omissions during PRO development included clearly defined constructs, conceptual frameworks, qualitative research in representative samples, and literature reviews. PRO development quality has improved significantly since FDA guidance was published (U = 10.0, p = 0.02). However, scatterplots and correlations between PRO COSMIN scores and citation frequency (rho = -0.62) and clinical trial usage (rho = +0.18) implied that PRO quality is unrelated to which PRO is chosen. COSMIN scores implied that the Fatigue Symptoms and Impact Questionnaire-Relapsing Multiple Sclerosis (FSIQ-RMS) and Neurological Fatigue Index-Multiple Sclerosis (NFI-MS) had the strongest evidence for adequate content validity.
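For readers unfamiliar with the statistics quoted above, the following is a minimal sketch of how a rank-based group comparison (the reported U is presumably a Mann-Whitney U) and a Spearman rank correlation (rho) can be computed. It is not the authors' analysis code. All numbers in it are hypothetical placeholders, not data from the study, and the function names are illustrative. The choice to report the smaller of the two U values is also an assumption, since the abstract does not state which convention was used.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U, reported here as the smaller of the two U values."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical COSMIN-style scores for PROs developed before and after
# a guidance document (placeholder values only).
scores_before = [1, 2, 2, 3, 4]
scores_after = [3, 5, 6, 6, 7]
u_stat = mann_whitney_u(scores_before, scores_after)

# Hypothetical score versus citation-count pairs (placeholder values only).
quality = [1, 2, 3, 4, 5, 6]
citations = [900, 400, 650, 120, 200, 60]
rho = spearman_rho(quality, citations)

print(f"U = {u_stat}, rho = {rho:.2f}")
```

A p-value for U would additionally require the exact or normal-approximation null distribution, which this sketch leaves out.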
Most existing fatigue PROs do not meet COSMIN content validity requirements. Although two PROs scored well on aggregate (NFI-MS and FSIQ-RMS), our subsequent evaluation of the item sets that generated their scores implied that both PROs have weaker content validity than COSMIN suggests. This indicates that COSMIN criteria require further development, and raises significant concerns about how we have measured one of the most common and burdensome MS symptoms. A detailed head-to-head psychometric evaluation is needed to determine how differences in PRO development quality, and the problems implied by our analyses, affect measurement performance.