Atkinson Thomas M, Reeve Bryce B, Dueck Amylou C, Bennett Antonia V, Mendoza Tito R, Rogak Lauren J, Basch Ethan, Li Yuelin
Department of Psychiatry & Behavioral Sciences, Memorial Sloan Kettering Cancer Center, 641 Lexington Ave., 7th Floor, New York, NY, 10022, USA.
Duke University Medical Center, Durham, NC, USA.
J Patient Rep Outcomes. 2018 Dec 4;2(1):56. doi: 10.1186/s41687-018-0086-x.
Traditional concordance metrics have shortcomings that depend on dataset characteristics (e.g., multiple attributes rated, missing data); it is therefore necessary to explore supplemental approaches to quantifying agreement between independent assessments. The purpose of this methodological paper is to apply an Item Response Theory (IRT)-based framework to an existing dataset that included unidimensional clinician and multiple attribute patient ratings of symptomatic adverse events (AEs), and to explore the utility of this method in patient-reported outcome (PRO) and health-related quality of life (HRQOL) research.
Data were derived from a National Cancer Institute-sponsored study examining the validity of a measurement system (PRO-CTCAE) for patient self-reporting of AEs in cancer patients receiving treatment (N = 940). AEs included 13 multiple attribute patient-reported symptoms that had corresponding unidimensional clinician AE grades. A Bayesian IRT Model was fitted to calculate the latent grading thresholds between raters. The posterior mean values of the model-fitted item responses were calculated to represent model-based AE grades obtained from patients and clinicians.
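To make the idea of rater-specific latent grading thresholds concrete, the following is a minimal sketch in Python. It assumes a graded response model form, which the abstract does not specify, and it does not perform any Bayesian fitting (no priors, no sampling). The function names, the discrimination value, the 0-4 grade scale, and both sets of thresholds are hypothetical and are not taken from the study. The sketch only shows how two raters who grade the same latent severity with different thresholds end up with different model-based expected grades.

```python
import math

def grm_category_probs(theta, discrimination, thresholds):
    """Return P(grade = k | theta) for k = 0..K under a graded response model.

    thresholds must be strictly increasing; K = len(thresholds).
    """
    # Cumulative probabilities P(grade >= k) for k = 1..K
    cum = [1.0 / (1.0 + math.exp(-discrimination * (theta - b)))
           for b in thresholds]
    # P(grade >= 0) = 1 and P(grade >= K + 1) = 0; adjacent differences
    # give the probability of each individual grade.
    upper = [1.0] + cum
    lower = cum + [0.0]
    return [u - l for u, l in zip(upper, lower)]

def expected_grade(theta, discrimination, thresholds):
    """Model-based expected grade at latent severity theta."""
    probs = grm_category_probs(theta, discrimination, thresholds)
    return sum(k * p for k, p in enumerate(probs))

# Hypothetical thresholds on the latent severity scale (NOT study estimates).
# Higher thresholds mean the rater needs more severity to assign each grade.
PATIENT_THRESHOLDS = [-1.0, 0.0, 1.0, 2.0]
CLINICIAN_THRESHOLDS = [-0.5, 0.8, 2.0, 3.2]
DISCRIMINATION = 1.5  # assumed common to both raters for simplicity

if __name__ == "__main__":
    for theta in (-1.0, 0.0, 1.5):
        patient = expected_grade(theta, DISCRIMINATION, PATIENT_THRESHOLDS)
        clinician = expected_grade(theta, DISCRIMINATION, CLINICIAN_THRESHOLDS)
        print(f"theta={theta:+.1f}  patient={patient:.2f}  "
              f"clinician={clinician:.2f}  gap={patient - clinician:.2f}")
```

In a full Bayesian analysis the thresholds would be estimated from the data with posterior uncertainty, and the expected grades would be averaged over posterior draws rather than computed from fixed values as here.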
Model-based AE grades showed a general pattern of clinician underestimation relative to patient-graded AEs. However, the magnitude of clinician underestimation was associated with AE severity, such that clinicians' underestimation was more pronounced for moderate/very severe model-estimated AEs, and less so for mild AEs.
The Bayesian IRT approach reconciles multiple symptom attributes and elaborates on the patterns of clinician-patient non-concordance beyond that provided by traditional metrics. This IRT-based technique may be used as a supplemental tool to detect and characterize nuanced differences in patient-, clinician-, and proxy-based ratings of HRQOL and patient-centered outcomes.
ClinicalTrials.gov NCT01031641. Registered 1 December 2009.