Department of Medicine, Johns Hopkins University School of Medicine, 624 N Broadway, Baltimore, MD 21205, USA.
J Gen Intern Med. 2012 Jun;27 Suppl 1(Suppl 1):S47-55. doi: 10.1007/s11606-012-2021-9.
Grading the strength of a body of diagnostic test evidence involves challenges over and above those related to grading the evidence from health care intervention studies. This chapter identifies challenges and outlines principles for grading the body of evidence related to diagnostic test performance.
Diagnostic test evidence is challenging to grade for two reasons: standard tools for grading evidence were designed for questions about treatment rather than diagnostic testing, and the clinical usefulness of a diagnostic test depends on multiple links in a chain of evidence connecting the performance of a test to changes in clinical outcomes.
Reviewers grading the strength of a body of evidence on diagnostic tests should consider the principal domains of risk of bias, directness, consistency, and precision, as well as publication bias, dose-response association, plausible unmeasured confounders that would decrease an effect, and strength of association, as is done when grading evidence on treatment interventions. Given that most evidence regarding the clinical value of diagnostic tests is indirect, an analytic framework must be developed to clarify the key questions, and the strength of evidence for each link in that framework should be graded separately. However, if reviewers choose to combine domains into a single grade of evidence, they should explain their rationale for a particular summary grade and the relevant domains that were weighed in assigning the summary grade.