Deeks Jonathan J
Centre for Statistics in Medicine, Institute of Health Sciences, Old Road, Headington, Oxford OX3 7LF, UK.
Stat Med. 2002 Jun 15;21(11):1575-600. doi: 10.1002/sim.1188.
Meta-analysis of binary data involves the computation of a weighted average of summary statistics calculated for each trial. The selection of the appropriate summary statistic is a subject of debate due to conflicts in the relative importance of mathematical properties and the ability to intuitively interpret results. This paper explores the process of identifying a summary statistic most likely to be consistent across trials when there is variation in control group event rates. Four summary statistics are considered: odds ratios (OR), risk differences (RD), and risk ratios of beneficial (RR(B)) and harmful (RR(H)) outcomes. Each summary statistic corresponds to a different pattern of predicted absolute benefit of treatment with variation in baseline risk, the greatest difference in patterns of prediction being between RR(B) and RR(H). Selection of a summary statistic solely based on identification of the best-fitting model by comparing tests of heterogeneity is problematic, principally due to low numbers of trials. It is proposed that choice of a summary statistic should be guided by both empirical evidence and clinically informed debate as to which model is likely to be closest to the expected pattern of treatment benefit across baseline risks. Empirical investigations comparing the four summary statistics on a sample of 551 systematic reviews provide evidence that the RR and OR models are on average more consistent than RD, there being no difference on average between RR and OR. From a second sample of 114 meta-analyses, evidence indicates that for interventions aimed at preventing an undesirable event, greatest absolute benefits are observed in trials with the highest baseline event rates, corresponding to the model of constant RR(H). The appropriate selection for a particular meta-analysis may depend on understanding reasons for variation in control group event rates; in some situations uncertainty about the choice of summary statistic will remain.
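The claim that each summary statistic implies a different pattern of absolute benefit across baseline risks can be illustrated with a small sketch. It is not taken from the paper: the function names, the model labels ("RD", "RR_H", "RR_B", "OR") and the reference trial (control risk 0.20, treated risk 0.10) are assumptions chosen for illustration. Risks refer to the harmful event, RR(B) is taken as the ratio of non-event probabilities, and control risks are assumed to lie strictly between 0 and 1.

```python
def treated_risk(control_risk, model, effect):
    """Predicted treated-group risk of the harmful event, assuming the
    given effect measure is constant across baseline (control) risks."""
    if model == "RD":        # risk difference: treated minus control
        risk = control_risk + effect
    elif model == "RR_H":    # risk ratio of the harmful outcome
        risk = effect * control_risk
    elif model == "RR_B":    # risk ratio of the beneficial outcome (non-event)
        risk = 1.0 - effect * (1.0 - control_risk)
    elif model == "OR":      # odds ratio of the harmful outcome
        odds = effect * control_risk / (1.0 - control_risk)
        risk = odds / (1.0 + odds)
    else:
        raise ValueError(f"unknown model: {model}")
    # RD and RR models can predict values outside [0, 1]; clip them here.
    return min(1.0, max(0.0, risk))


def absolute_benefit(control_risk, model, effect):
    """Absolute reduction in the risk of the harmful event."""
    return control_risk - treated_risk(control_risk, model, effect)


def matched_effects(control_risk, treated):
    """Effect sizes under which all four models agree at one reference trial."""
    return {
        "RD": treated - control_risk,
        "RR_H": treated / control_risk,
        "RR_B": (1.0 - treated) / (1.0 - control_risk),
        "OR": (treated / (1.0 - treated)) / (control_risk / (1.0 - control_risk)),
    }


if __name__ == "__main__":
    # Hypothetical reference trial: control risk 0.20, treated risk 0.10.
    effects = matched_effects(0.20, 0.10)
    for baseline in (0.05, 0.20, 0.40, 0.60):
        row = ", ".join(
            f"{name}={absolute_benefit(baseline, name, value):.3f}"
            for name, value in effects.items()
        )
        print(f"baseline {baseline:.2f}: {row}")
```

Under these assumptions the four models agree at the reference baseline and diverge elsewhere: as baseline risk rises, predicted absolute benefit grows under constant RR(H), shrinks under constant RR(B), and stays fixed under constant RD, with constant OR falling between the two risk ratio models.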