Jing Yaqi, Lin Lifeng
Boehringer Ingelheim Pharmaceuticals, Inc, Ridgefield, CT, USA.
Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, USA.
JBI Evid Synth. 2024 Mar 1;22(3):394-405. doi: 10.11124/JBIES-23-00368.
In systematic reviews and meta-analyses of continuous outcomes, the mean difference (MD) and the standardized mean difference (SMD) are two commonly used effect measures. The SMD is motivated by scenarios in which the studies collected in a systematic review do not report the continuous measures on the same scale. Standardization converts the MDs into unit-free measures that can be synthesized across studies. As such, some evidence synthesis researchers tend to prefer the SMD over the MD. However, other researchers have concerns about the interpretability of the SMD, and the standardization process could also introduce additional heterogeneity between studies. In this paper, we use simulation studies to illustrate that, when the continuous measures are on the same scale, the SMD could perform considerably worse than the MD in some cases. The simulations compare the MD and SMD in various settings, including cases where the normality assumption for the continuous measures does not hold. We conclude that, although the SMD remains useful for evidence synthesis of continuous measures on different scales, it could have substantially greater biases, greater mean squared errors, and lower coverage probabilities of confidence intervals (CIs) than the MD. The MD is also generally more robust to violations of the normality assumption for continuous measures. In scenarios where continuous measures are inherently comparable or can be transformed to a common scale, the MD is the preferred choice of effect measure.
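As a minimal sketch of how standardization can add between-study heterogeneity, the Python snippet below computes the MD and an SMD for two hypothetical studies measured on the same scale. The abstract does not say which SMD estimator the paper uses; Hedges' g (the MD divided by the pooled standard deviation, with the usual approximate small-sample correction) is assumed here for illustration only. The function names and the toy data are likewise invented for this example and are not taken from the paper.

```python
import math
import statistics


def mean_difference(treatment, control):
    """Raw mean difference (MD), in the original measurement units."""
    return statistics.mean(treatment) - statistics.mean(control)


def hedges_g(treatment, control):
    """SMD as Hedges' g (an assumed estimator, not necessarily the paper's)."""
    n1, n2 = len(treatment), len(control)
    var1 = statistics.variance(treatment)  # sample variance, n - 1 denominator
    var2 = statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    cohen_d = mean_difference(treatment, control) / pooled_sd
    # Approximate small-sample correction: 1 - 3 / (4 * df - 1), df = n1 + n2 - 2
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return correction * cohen_d


# Hypothetical data: both studies share one scale and the same MD of 2 units,
# but study B has twice the within-group standard deviation of study A.
study_a = ([12, 14, 16, 18, 20], [10, 12, 14, 16, 18])
study_b = ([8, 12, 16, 20, 24], [6, 10, 14, 18, 22])

for name, (treatment, control) in (("A", study_a), ("B", study_b)):
    md = mean_difference(treatment, control)
    smd = hedges_g(treatment, control)
    print(f"Study {name}: MD = {md:.3f}, SMD = {smd:.3f}")
```

In this toy setting the two MDs agree exactly, while the SMD of study A is twice that of study B purely because the within-study standard deviations differ. The paper's simulations are far more extensive; this only illustrates the mechanism.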