Cordero Cynthia P, Dans Antonio L
Department of Clinical Epidemiology, College of Medicine, University of the Philippines, Manila, Philippines.
Department of Clinical Epidemiology, College of Medicine, University of the Philippines, Manila, Philippines; Department of Medicine, College of Medicine, University of the Philippines Manila-Philippine General Hospital, Manila, Philippines.
J Clin Epidemiol. 2021 Feb;130:149-151. doi: 10.1016/j.jclinepi.2020.09.045.
In a meta-analysis, a question always arises. Is it worthwhile to combine estimates from studies of different populations using various formulations of an intervention, evaluating outcomes measured differently? Sometimes even study designs differ. Differences are expected in a meta-analysis. These may be negligible, and a pooled estimate of effect can guide the clinical decision. However, when the differences are large, this estimate may mislead. Effect estimates from study to study differ because of real differences (between-study variability) and because of chance (within-study variability). To combine estimates when there is heterogeneity (between-study differences are large) may not be sensible. Two complementary methods may be used to detect heterogeneity: visual inspection of the forest plot and calculating numerical measures of heterogeneity (I² and Q). Visual inspection can show effects that are different from the rest. A large I² (proportion of overall variability attributed to between-study variation) or a small P-value associated with Q may suggest heterogeneity. Large P-values, however, do not mean the absence of heterogeneity. It is more informative to report the confidence interval of the I². If there is no heterogeneity, a pooled estimate of the true effect may be generated using only within-study variation (fixed-effect model). If there is substantial heterogeneity, reasons should be sought. Subgroup analysis or meta-regression using study-level characteristics may be done. Although more involved and potentially challenging, individual-level data (Individual Participant Data, IPD) may also be used. In the case of unexplained heterogeneity, both within- and between-study variation should be used to generate a pooled estimate (random-effects model). This estimate does not estimate a single true effect but estimates the average of a range of effects of the intervention on populations represented by the studies. If precise enough (narrow confidence interval), this estimate, together with the prediction interval (a measure of uncertainty in the effect one might see in a particular context), can guide clinical and policy decisions.
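The quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes inverse-variance weighting, the DerSimonian-Laird estimator for the between-study variance, and a normal critical value of 1.96 for the prediction interval (a t quantile with k - 2 degrees of freedom is often used instead). The function name `pool_effects` and the input numbers are hypothetical.

```python
import math

def pool_effects(effects, variances, crit=1.96):
    """Pool study effects (e.g. log odds ratios) given within-study variances.

    Assumptions (not from the abstract): inverse-variance weights,
    DerSimonian-Laird tau^2, and a normal critical value `crit`.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect weights use within-study variance only
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # I^2: proportion of overall variability attributed to between-study variation
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights use both within- and between-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # Approximate prediction interval for the effect in a new setting
    half = crit * math.sqrt(tau2 + se ** 2)
    return {
        "fixed": fixed, "Q": q, "I2": i2, "tau2": tau2,
        "random": pooled, "se_random": se,
        "prediction_interval": (pooled - half, pooled + half),
    }

# Hypothetical data: three studies with equal precision but different effects
result = pool_effects([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(result)
```

With these made-up inputs the prediction interval is considerably wider than the confidence interval of the pooled estimate, which reflects the distinction the abstract draws between the average effect and the effect one might see in a particular context. A P-value for Q would need a chi-square distribution with k - 1 degrees of freedom, which this sketch leaves out.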