Rutten-van Mölken M P, Bakker C H, van Doorslaer E K, van der Linden S
Department of Health Economics, University of Limburg, Maastricht, The Netherlands.
Med Care. 1995 Sep;33(9):922-37. doi: 10.1097/00005650-199509000-00004.
This article explores various methodological issues of patient utility measurement in two randomized controlled clinical trials involving 85 patients with fibromyalgia and 144 with ankylosing spondylitis. In both trials, one baseline and two follow-up measurements of the patients' preferences for their own health state and several hypothetical states were performed using the rating scale and the standard gamble methods. It was confirmed that standard gamble scores are consistently higher than rating scale scores for both the experienced and the hypothetical states. The 3-month test-retest reliability for hypothetical states, measured by intraclass correlation coefficients, ranged from 0.24 to 0.33 for the rating scale and from 0.43 to 0.70 for the standard gamble. Although the reproducibility is not high, the group mean scores are fairly stable over time. Mean standard gamble scores tend to differ depending on the way the measurements are undertaken. Utilities elicited with chained gambles were significantly higher than utilities elicited with basic reference gambles. At the individual level, some inconsistent responses occurred. However, more than 70% of these fell within the bounds of the measurement error, which ranged from 0.11 to 0.13 on the standard gamble (0-1 scale) and from 8 to 10 on the rating scale (0-100 scale). The large number of negative utilities for the severe hypothetical state, which was used as an anchor point in the chained gambles, and the magnitude of these negative utilities (down to -19) call for intensified research efforts to handle these responses in utility calculations.
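The abstract does not give the formulas behind the basic reference gamble, the chained gamble, or the negative utilities. The sketch below uses the conventional expected-utility conversions as an illustration only, assuming perfect health is scored 1 and death 0. The function names and the example probabilities are hypothetical and are not taken from the article.

```python
def basic_gamble_utility(p_indifference: float) -> float:
    """Basic reference gamble for a state preferred to death.

    Assumed setup: the sure state is compared with a gamble giving
    perfect health (utility 1) with probability p and death (utility 0)
    with probability 1 - p. At indifference, u = p.
    """
    if not 0.0 <= p_indifference <= 1.0:
        raise ValueError("p_indifference must lie in [0, 1]")
    return p_indifference


def worse_than_death_utility(p_indifference: float) -> float:
    """Gamble for a state judged worse than death.

    Assumed setup: certain death (utility 0) is compared with a gamble
    giving perfect health with probability p and the state with
    probability 1 - p. Solving 0 = p * 1 + (1 - p) * u gives
    u = -p / (1 - p), which is unbounded below as p approaches 1.
    """
    if not 0.0 <= p_indifference < 1.0:
        raise ValueError("p_indifference must lie in [0, 1)")
    return -p_indifference / (1.0 - p_indifference)


def chained_gamble_utility(p_indifference: float, anchor_utility: float) -> float:
    """Chained gamble: the anchor state replaces death as the worst outcome.

    Assumed setup: the sure state is compared with a gamble giving
    perfect health with probability p and the anchor state with
    probability 1 - p, so u = p + (1 - p) * u_anchor.
    """
    if not 0.0 <= p_indifference <= 1.0:
        raise ValueError("p_indifference must lie in [0, 1]")
    return p_indifference + (1.0 - p_indifference) * anchor_utility


if __name__ == "__main__":
    # Hypothetical respondent who rates the severe anchor state as worse than death.
    anchor = worse_than_death_utility(0.95)
    print("anchor utility:", round(anchor, 6))
    print("chained utility at p = 0.9:", round(chained_gamble_utility(0.9, anchor), 6))
```

Under these assumptions, an indifference probability of 0.95 in the worse-than-death gamble yields an anchor utility of about -19, the same order as the lower bound quoted above, although the abstract does not say how that value was obtained. Because the anchor enters every chained utility, a strongly negative anchor pulls all chained values down, which is one reason the handling of such responses matters.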