Health Economics and Decision Science, School of Health and Related Research, University of Sheffield, West Court, 1 Mappin Street, Sheffield, S1 4DT, UK.
Eur J Health Econ. 2018 May;19(4):557-570. doi: 10.1007/s10198-017-0902-x. Epub 2017 May 30.
To assess the evidence on the validity and responsiveness of five commonly used preference-based instruments, the EQ-5D, SF-6D, HUI3, 15D and AQoL, by undertaking a review of reviews.
Four databases were searched using a strategy refined with a highly sensitive filter for systematic reviews. References were screened and a search for grey literature was performed. Identified citations were assessed against pre-defined eligibility criteria and data were extracted using a customized extraction template. Evidence on known group validity, convergent validity and responsiveness was extracted and reviewed by narrative synthesis. Quality of the included reviews was assessed using a modified version of the AMSTAR checklist.
Thirty reviews were included, sixteen of which were of excellent or good quality. The body of evidence, covering more than 180 studies, was heavily skewed towards EQ-5D, with considerably fewer studies investigating HUI3 and SF-6D, and very few investigating the 15D and AQoL. There was also a lack of head-to-head comparisons between generic preference-based measures (GPBMs), and the tests reported by the reviews were often weak. Where there was evidence, EQ-5D, SF-6D, HUI3, 15D and AQoL seemed generally valid and responsive instruments, although not for all conditions. Evidence was not consistently reported across reviews.
Although generally valid, EQ-5D, SF-6D and HUI3 suffer from some problems and perform inconsistently in some populations. The lack of head-to-head comparisons and the poor reporting impede the comparative assessment of the performance of GPBMs. This highlights the need for large comparative studies designed to test instruments' performance.