Institute for Medical Informatics, Biometry and Epidemiology, University of Munich, Marchioninistrasse 15, 81377 Munich, Germany.
J Epidemiol Community Health. 2013 Jan;67(1):98-104. doi: 10.1136/jech-2011-200940. Epub 2012 Jul 31.
Systematic reviews are a cornerstone of evidence-based public health. The method of appraising the quality of different intervention and observational study designs in such reviews remains an important challenge. This article examines the applicability of selected quality appraisal tools (QATs) and the impact of choice of tool on the meta-analysis of a published systematic review.
The authors selected a systematic review on the effectiveness of hand washing with soap in preventing diarrhoea, covering a range of epidemiological study designs. Six QATs were used to assess 13 studies meeting their inclusion criteria; component sections/questions were coded numerically to derive a summary score between -1 (low quality) and +1 (high quality) for each QAT and study. Heterogeneity in study quality was evaluated graphically using traffic light schemes and spider charts. Random effects meta-analysis was undertaken for all studies; sensitivity analyses for each QAT included only those studies with a score of 0 or above.
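The scoring and sensitivity-analysis procedure described above can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the abstract does not say how item codes were combined or which random-effects estimator was used, so the averaging in `summary_score` and the DerSimonian-Laird estimator in `pool_random_effects` are assumptions, and all function names are hypothetical.

```python
import math

def summary_score(item_codes):
    """Average item codes (each -1, 0 or +1) into a score in [-1, +1].

    Assumption: simple unweighted mean; the abstract does not give the rule.
    """
    return sum(item_codes) / len(item_codes)

def sensitivity_subset(studies, threshold=0.0):
    """Keep only studies whose summary score is at or above the threshold.

    Each study is a (log_or, standard_error, score) tuple.
    """
    return [s for s in studies if s[2] >= threshold]

def pool_random_effects(log_ors, ses, z=1.96):
    """Pool log odds ratios with the DerSimonian-Laird estimator (assumed).

    Returns (pooled OR, lower 95% limit, upper 95% limit, tau squared).
    """
    k = len(log_ors)
    w = [1.0 / se ** 2 for se in ses]            # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return (math.exp(pooled),
            math.exp(pooled - z * se_pooled),
            math.exp(pooled + z * se_pooled),
            tau2)

def pool_studies(studies):
    """Convenience wrapper for a list of (log_or, se, score) tuples."""
    return pool_random_effects([s[0] for s in studies], [s[1] for s in studies])
```

Calling `pool_studies(studies)` gives the main analysis and `pool_studies(sensitivity_subset(studies))` gives the sensitivity analysis for one QAT. Because the subset contains fewer studies, its confidence interval will generally be wider even when the pooled estimate is similar.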
The authors found substantial heterogeneity in summary scores for a given study. Their main meta-analysis yielded an OR of 0.60 (95% CI 0.47 to 0.77) with most sensitivity analyses giving similar pooled effect sizes with wider CIs.
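As a reading aid, the reported interval can be related to the standard error of the pooled log odds ratio. The sketch below assumes the 95% CI was formed symmetrically on the log scale with a normal quantile of 1.96, which is conventional but not stated in the abstract; `se_from_ci` is a hypothetical helper name.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Back-calculate the standard error of log(OR) from a CI on the OR scale.

    Assumes the interval is exp(log(OR) +/- z * SE).
    """
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Reported main analysis: OR 0.60 (95% CI 0.47 to 0.77)
se_main = se_from_ci(0.47, 0.77)

# Geometric midpoint of the limits; should sit close to the reported OR
midpoint = math.exp((math.log(0.47) + math.log(0.77)) / 2)
```

Under this assumption the standard error of the log odds ratio is roughly 0.126, and the geometric midpoint of the limits is close to the reported OR of 0.60, which is consistent with an interval built on the log scale.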
The six QATs differ greatly in applicability across study designs, approach to quality appraisal (ie, scale vs checklist, presence/absence of summary score), coverage of domains and quality of component questions and answers. Learning from the advantages and disadvantages of each QAT, we recommend research into the development of a reliable QAT with broad applicability across study designs.