University of California, San Francisco, CA 94118, United States.
Patient Educ Couns. 2009 Jul;76(1):106-12. doi: 10.1016/j.pec.2008.11.012. Epub 2009 Jan 1.
To compare the performance of categorical and continuous measures of patient knowledge in the context of risk communication about breast cancer, in terms of statistical and clinical significance as well as efficiency.
Twenty breast cancer patients provided estimates of 10-year mortality risk before and after their oncology visit. The oncologist reviewed risk estimates from Adjuvant!, a well-validated and commonly used prognostic model. Using the Adjuvant! estimates as a gold standard, we calculated how accurate the patient estimates were before and after the visit. We used three novel continuous measures of patient accuracy, the absolute bias, Brier, and Kullback-Leibler scores, and compared them to a categorical measure in terms of sensitivity to intervention effects. We also calculated the sample size required to replicate the primary study using the categorical and continuous measures, as a means of comparing efficiency.
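The three continuous measures named above can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: each patient estimate and each Adjuvant! estimate is treated as a probability of death within 10 years, the Brier score is taken as the squared difference from the reference estimate, and the Kullback-Leibler score is taken as the divergence of the patient's estimate from the reference for a binary outcome. The function names and the numbers in the usage lines are hypothetical, not study data.

```python
import math


def absolute_bias(patient: float, reference: float) -> float:
    """Absolute difference between the patient's and the reference risk."""
    return abs(patient - reference)


def brier_score(patient: float, reference: float) -> float:
    """Squared difference between the two risk estimates (assumed definition)."""
    return (patient - reference) ** 2


def kl_score(patient: float, reference: float, eps: float = 1e-6) -> float:
    """Kullback-Leibler divergence of the patient's estimate from the
    reference for a binary outcome (assumed definition).

    Estimates are clipped to [eps, 1 - eps] so that answers of exactly
    0 or 1 do not produce log(0) or a division by zero.
    """
    p = min(max(patient, eps), 1 - eps)
    q = min(max(reference, eps), 1 - eps)
    return q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p))


# Hypothetical patient: the reference risk is 20%; the patient estimates
# 50% before the visit and 25% after it.
reference = 0.20
for label, estimate in [("before", 0.50), ("after", 0.25)]:
    print(
        label,
        absolute_bias(estimate, reference),
        brier_score(estimate, reference),
        kl_score(estimate, reference),
    )
```

Under these assumed definitions, all three scores are zero when the patient's estimate equals the reference, and they grow as the estimate moves away from it, so a lower score after the visit indicates improved accuracy.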
In this sample, the Kullback-Leibler measure was most sensitive to the intervention effects (p=0.004), followed by Brier and absolute bias (both p=0.011), and finally the categorical measure (p=0.125). The sample size required to replicate the primary study was 18 for the Kullback-Leibler measure, 23 for absolute bias and Brier, and 37 for the categorical measure.
The continuous measures required smaller sample sizes and led to rejection of the null hypothesis of no intervention effect. However, the difference in sensitivity of the continuous measures was not statistically significant, and the performance of the categorical measure depends on the researcher's categorical cutoff for accuracy. Continuous measures of patient accuracy may be more sensitive and efficient, while categorical measures may be more clinically relevant.
Researchers and others interested in assessing the accuracy of patient knowledge should weigh the trade-offs between clinical relevance and statistical significance while designing or evaluating risk communication studies.