Perleth Matthias, Langer Gero, Meerpohl Joerg J, Gartlehner Gerald, Kaminski-Hartenthaler Angela, Schünemann Holger J
Abteilung Fachberatung Medizin, Gemeinsamer Bundesausschuss, Berlin.
Z Evid Fortbild Qual Gesundhwes. 2012;106(10):733-44. doi: 10.1016/j.zefq.2012.10.018. Epub 2012 Nov 16.
This article deals with inconsistency of relative, rather than absolute, treatment effects in binary/dichotomous outcomes. A body of evidence is not rated up in quality if studies yield consistent results, but may be rated down in quality if results are inconsistent. Criteria for evaluating consistency include similarity of point estimates, extent of overlap of confidence intervals, and statistical criteria including tests of heterogeneity and I². To explore heterogeneity, systematic review authors should generate and test a small number of a priori hypotheses related to patients, interventions, outcomes, and methodology. When inconsistency is large and unexplained, rating down quality for inconsistency is appropriate, particularly if some studies suggest substantial benefit and others suggest no effect or harm (rather than only large versus small effects). Apparent subgroup effects may be spurious. Credibility is increased if subgroup effects are based on a small number of a priori hypotheses with a specified direction; if subgroup comparisons come from within rather than between studies; if tests of interaction generate low p-values; and if the effects have a biological rationale.
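The statistical criteria named in the abstract (a test of heterogeneity, I², and a test of interaction between subgroups) can be illustrated with a minimal Python sketch. It assumes fixed-effect inverse-variance weighting of log relative risks, which is one common choice and not necessarily the method the authors used. The function names (`log_rr`, `heterogeneity`, `interaction_p`) and all numbers are hypothetical and serve only as illustration.

```python
import math

def log_rr(events_t, n_t, events_c, n_c):
    """Log relative risk and its approximate variance for one two-arm study."""
    rr = (events_t / n_t) / (events_c / n_c)
    var = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return math.log(rr), var

def heterogeneity(effects, variances):
    """Cochran's Q, its degrees of freedom, and I² (in percent).

    Uses fixed-effect inverse-variance weights (an assumption of this sketch).
    """
    weights = [1 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I² is the share of total variation attributed to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, df, i2

def interaction_p(effect_a, var_a, effect_b, var_b):
    """Two-sided p-value of a z-test comparing two subgroup estimates."""
    z = (effect_a - effect_b) / math.sqrt(var_a + var_b)
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical log relative risks and variances from two studies.
pooled, q, df, i2 = heterogeneity([0.0, 1.0], [0.1, 0.1])
p_int = interaction_p(0.0, 0.1, 1.0, 0.1)
```

A high I² or a low interaction p-value is only one input to the judgment. As the abstract notes, similarity of point estimates, overlap of confidence intervals, and whether the subgroup hypothesis was specified a priori with a stated direction matter as well.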