University of Bristol, Bristol, United Kingdom (D.M.P., N.J.W., D.M.C., A.A.).
University of Bristol, Bristol, United Kingdom, and University of York, Heslington, York, United Kingdom (S.D.).
Ann Intern Med. 2019 Apr 16;170(8):538-546. doi: 10.7326/M18-3542. Epub 2019 Mar 26.
Guideline development requires the synthesis of evidence on several treatments of interest, typically by using network meta-analysis (NMA). Because treatment effects may be estimated imprecisely or be based on evidence lacking internal or external validity, guideline developers must assess the robustness of recommendations made on the basis of the NMA to potential limitations in the evidence. Such limitations arise because the observed estimates differ from the true effects of interest, for example, because of study biases, sampling variation, or issues of relevance. The widely used GRADE (Grading of Recommendations Assessment, Development and Evaluation) framework aims to assess the quality of evidence supporting a recommendation by using a structured series of qualitative judgments. This article argues that GRADE approaches proposed for NMA are insufficient for the purposes of guideline development, because the influence of the evidence on the final recommendation is not taken into account. It outlines threshold analysis as an alternative approach, demonstrating the method with 2 examples of clinical guidelines from the National Institute for Health and Care Excellence (NICE) in the United Kingdom. Threshold analysis quantifies precisely how much the evidence could change (for any reason, such as potential biases, or simply sampling variation) before the recommendation changes, and what the revised recommendation would be. If it is judged that the evidence could not plausibly change by more than this amount, then the recommendation is considered robust; otherwise, it is sensitive to plausible changes in the evidence. In this manner, threshold analysis directly informs decision makers and guideline developers of the robustness of treatment recommendations.
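The core idea of a threshold can be illustrated with a deliberately simplified sketch. The abstract does not describe an algorithm, so everything below is an assumption made for illustration: the function name `threshold_analysis` is hypothetical, the evidence is reduced to one point estimate per treatment (relative to a common reference, higher is better), the recommendation is taken to be the treatment with the largest estimate, and ties are ignored. For each estimate, the sketch reports how far it could move down or up before the recommendation changes, and which treatment would then be recommended.

```python
import math


def threshold_analysis(effects):
    """Toy threshold analysis on point estimates (illustrative assumption only).

    effects: dict mapping treatment name -> estimated effect (higher is better).
    Needs at least two treatments. Returns the currently recommended treatment
    and, for each treatment, the change in its estimate at which the
    recommendation would switch, paired with the new recommendation.
    """
    ranked = sorted(effects, key=effects.get, reverse=True)
    best, runner_up = ranked[0], ranked[1]

    results = {}
    for name, value in effects.items():
        if name == best:
            # Lowering the best estimate past the runner-up changes the decision;
            # raising it can never change the decision.
            results[name] = {
                "lower": (effects[runner_up] - value, runner_up),
                "upper": (math.inf, None),
            }
        else:
            # Raising another estimate past the best one makes it the new choice;
            # lowering it can never change the decision.
            results[name] = {
                "lower": (-math.inf, None),
                "upper": (effects[best] - value, name),
            }
    return best, results


if __name__ == "__main__":
    # Hypothetical estimates, not taken from any guideline.
    best, results = threshold_analysis({"A": 0.0, "B": 0.5, "C": 0.75})
    print("Recommended:", best)
    for name, bounds in results.items():
        print(name, bounds)
```

With these made-up numbers, treatment C is recommended, and its estimate would have to fall by more than 0.25 before B became the recommendation. Whether a change of that size is plausible is the judgment that determines whether the recommendation is considered robust.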