Morgano Gian Paolo, Wiercioch Wojtek, Piovani Daniele, Neumann Ignacio, Nieuwlaat Robby, Piggott Thomas, Alonso-Coello Pablo, Mbuagbaw Lawrence, Rigoni Marta, Bognanni Antonio, Celedon Natalia, Mustafa Reem A, Pottie Kevin, Leontiadis Grigorios I, Akl Elie A, Bonovas Stefanos, Schünemann Holger J
European Commission, Joint Research Centre (JRC), Ispra, Italy; Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada.
Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, Ontario, Canada; Michael G. DeGroote Cochrane Canada & McMaster GRADE Centres, McMaster University, Hamilton, Ontario, Canada.
J Clin Epidemiol. 2025 Mar;179:111639. doi: 10.1016/j.jclinepi.2024.111639. Epub 2024 Dec 10.
GRADE and other evidence to decision (EtD) frameworks are widely used by guideline development groups (GDG) and other decision-makers. When GDGs judge the magnitude of desirable and undesirable health outcomes on EtDs, they typically categorize them as trivial, small, moderate, or large. However, generic judgment or decision thresholds (DTs) that could guide the user about such estimates of effect size or serve as references for interpretation of findings are not yet available. The objective of this study was to empirically derive DTs for EtD judgments about the magnitude of dichotomously assessed health benefits and harms.
We conducted a methodological randomized controlled trial to derive empirical DTs across conditions and health outcomes. We invited stakeholders, including clinicians, epidemiologists, decision scientists, health research methodologists, experts in health technology assessment (HTA), members of GDGs, patient representatives, and the public to participate in the trial. We employed randomly assigned case scenarios to elicit ranges of absolute risk differences judged as small and moderate effects from study participants. We then used the collected data to derive empirical DTs. We also investigated the validity of our DTs by measuring the agreement between judgments that were made by GDGs in the past and the judgments that our DTs approach would suggest if applied to the same guideline data.
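The validity check described above (comparing past GDG judgments with the categories that DTs would suggest for the same data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the threshold values, the function names `classify_effect` and `percent_agreement`, and the use of raw percent agreement as the agreement measure are all hypothetical, and the sketch ignores the outcome values that the study also considers.

```python
from bisect import bisect_right

CATEGORIES = ["trivial", "small", "moderate", "large"]

# HYPOTHETICAL placeholder thresholds (events per 1000 people).
# These are NOT the decision thresholds derived in the study.
PLACEHOLDER_THRESHOLDS = (10.0, 30.0, 60.0)


def classify_effect(risk_difference_per_1000, thresholds=PLACEHOLDER_THRESHOLDS):
    """Map an absolute risk difference to a magnitude category.

    The sign is dropped so that benefits and harms use the same scale.
    A value exactly equal to a threshold falls into the higher category.
    """
    magnitude = abs(risk_difference_per_1000)
    return CATEGORIES[bisect_right(thresholds, magnitude)]


def percent_agreement(past_judgments, suggested_judgments):
    """Return the percentage of pairs in which both judgments match."""
    if len(past_judgments) != len(suggested_judgments):
        raise ValueError("Both lists must have the same length.")
    if not past_judgments:
        raise ValueError("At least one pair of judgments is required.")
    matches = sum(p == s for p, s in zip(past_judgments, suggested_judgments))
    return 100.0 * matches / len(past_judgments)


# Illustrative data only: made-up risk differences and past judgments.
risk_differences = [5, -45, 80]
past = ["trivial", "small", "large"]
suggested = [classify_effect(rd) for rd in risk_differences]
agreement = percent_agreement(past, suggested)
```

In practice, a chance-corrected statistic such as a weighted kappa may be more appropriate for ordered categories; raw agreement is used here only to keep the sketch short.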
A total of 445 stakeholders accessed the survey, of whom 409 were randomized and 288 rated at least one case scenario. Based on these participants, the study findings support our a priori hypothesis of a difference in the DTs for trivial, small, moderate, and large effects and are suggestive of a relation between raters' judgments and the joint measure of absolute effects and outcome values. The results permit the calculation and use of DTs for a variety of scenarios, and we present three ways to use the results in practice.
In this trial we confirmed that empirically derived DTs discriminate between judgments on the EtDs. These DTs can be used for judgments about desirable and undesirable health effects in systematic reviews or to initiate and inform a discussion with a GDG. This ensures consistency in judgments across different guideline questions and promotes transparency in judgments.
Decision thresholds (DTs) help determine whether the effects of interventions should be considered absent, small, moderate, or large. In this study we derived an overarching approach for these thresholds across conditions and outcomes. The results of this study, a randomized experiment, will help guideline developers and other decision-makers to make these judgments objectively. They will be particularly relevant for use in Grading of Recommendations Assessment, Development, and Evaluation (GRADE) evidence to decision (EtD) frameworks.