Department of Clinical Pharmacology, School of Medicine, Faculty of Health, Witten/Herdecke University, Witten, Germany.
Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany.
BMC Med Res Methodol. 2022 Aug 30;22(1):234. doi: 10.1186/s12874-022-01715-5.
Systematic reviews that synthesize safety outcomes pose challenges (e.g. rare events), which raise questions for grading the strength of the body of evidence. This may be one reason why the recommendations in many potentially inappropriate medication (PIM) lists are not based on formalized systems for assessing the quality of the body of evidence, such as GRADE. In this contribution, we describe specifications and suggest adaptations of the GRADE system for grading the quality of evidence on safety outcomes, which were developed while preparing a PIM list, namely PRISCUS.
We systematically assessed each of the five GRADE domains for rating down (study limitations, imprecision, inconsistency, indirectness, publication bias) and the criteria for rating up, considering whether special considerations or revisions of the original approach were indicated. The results were compiled in a written document and discussed in a group meeting of five members with varied backgrounds until consensus was reached. Subsequently, we performed a proof-of-concept application using a convenience sample of systematic reviews and applied the approach to systematic reviews on 19 different clinical questions.
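For readers less familiar with the mechanics, the general GRADE logic of rating down across the five domains and rating up can be sketched as simple level arithmetic. This is a minimal illustration of standard GRADE only, not the adapted approach described here: the function name `grade_certainty`, its parameters, and the numeric encoding of levels are hypothetical, and the proposed adaptations (including the new subgroup-analysis criterion) are not modelled because this abstract does not specify them.

```python
# Illustrative sketch of generic GRADE level arithmetic (hypothetical names).
LEVELS = ["very low", "low", "moderate", "high"]

RATING_DOWN_DOMAINS = (
    "study limitations",
    "imprecision",
    "inconsistency",
    "indirectness",
    "publication bias",
)


def grade_certainty(start, rate_down=None, rate_up=0):
    """Return the certainty level after rating down and rating up.

    start     -- starting level, e.g. "high" for randomized trials
                 or "low" for observational studies
    rate_down -- mapping of domain name to levels deducted (1 or 2)
    rate_up   -- total levels added by the rating-up criteria
    """
    rate_down = rate_down or {}
    unknown = set(rate_down) - set(RATING_DOWN_DOMAINS)
    if unknown:
        raise ValueError(f"unknown rating-down domains: {sorted(unknown)}")

    score = LEVELS.index(start) - sum(rate_down.values()) + rate_up
    # Clamp to the four available levels.
    score = max(0, min(score, len(LEVELS) - 1))
    return LEVELS[score]


if __name__ == "__main__":
    # Randomized evidence rated down one level each for two domains.
    print(grade_certainty("high", {"study limitations": 1, "imprecision": 1}))
    # Observational evidence rated up one level for a large effect.
    print(grade_certainty("low", rate_up=1))
```

In practice the judgement within each domain is qualitative; the arithmetic above only shows how such judgements combine into one of the four levels.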
We describe specifications and suggest adaptations for the criteria "study limitations", "imprecision", "publication bias" and "rating up for large effect". In addition, we suggest a new criterion to account for data from subgroup analyses. The proof-of-concept application did not reveal a need for further revision, and we therefore used the approach for the systematic reviews that were prepared for the PRISCUS list. We assessed 51 outcomes. Each of the proposed adaptations was applied. There was neither an excessive number of low and very low ratings nor an excessive number of high ratings; rather, the differing methodological quality of the safety outcomes appeared to be well reflected.
The suggestions appear to have the potential to overcome some of the challenges of grading the methodological quality of evidence on harms and may thus be helpful for producers of evidence syntheses that consider safety.