Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Via Loredan 18, 35121, Padova, Italy.
Department of Cardiac, Thoracic, Vascular Sciences and Public Health, University of Padova, Padova, Italy.
BMC Med Res Methodol. 2021 Nov 22;21(1):256. doi: 10.1186/s12874-021-01454-z.
Propensity score matching is a statistical method that is often used to make inferences on treatment effects in observational studies. In recent years, the technique has been widely used in the cardiothoracic surgery literature to evaluate the potential benefits of new surgical therapies or procedures. However, the small sample size and the strong dependence of the treatment assignment on the baseline covariates that often characterize these studies make such an evaluation challenging from a statistical point of view. In such settings, the use of propensity score matching in combination with oversampling and replacement may provide a solution to these issues by increasing the initial sample size of the study and thus improving the statistical power that is needed to detect the effect of interest. In this study, we review the use of propensity score matching in combination with oversampling and replacement in small sample size settings.
We performed a series of Monte Carlo simulations to evaluate how the sample size, the proportion of treated units, and the assignment mechanism affect the performance of the proposed approaches. We assessed performance in terms of overall balance, relative bias, root mean squared error, and nominal coverage. Moreover, we illustrated the methods using a real case study from the cardiac surgery literature.
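As a minimal sketch of how three of these performance measures could be computed from a set of Monte Carlo replicates, consider the following. The function name `simulation_metrics`, the percentage scale for relative bias, and the Wald-type 95% interval (estimate ± 1.96 × standard error) are assumptions made for illustration, not details taken from the study.

```python
import math


def simulation_metrics(estimates, std_errors, true_effect, z=1.96):
    """Summarise Monte Carlo replicates of a treatment effect estimator.

    estimates, std_errors: one value per simulated data set.
    true_effect: the (non-zero) effect used to generate the data.
    z: normal quantile for the interval (assumed: 1.96 for 95%).
    """
    n = len(estimates)
    mean_estimate = sum(estimates) / n

    # Relative bias, expressed here as a percentage of the true effect.
    relative_bias_pct = 100.0 * (mean_estimate - true_effect) / true_effect

    # Root mean squared error around the true effect.
    rmse = math.sqrt(sum((e - true_effect) ** 2 for e in estimates) / n)

    # Coverage: share of replicates whose interval contains the true effect.
    covered = sum(
        1
        for e, se in zip(estimates, std_errors)
        if e - z * se <= true_effect <= e + z * se
    )

    return {
        "relative_bias_pct": relative_bias_pct,
        "rmse": rmse,
        "coverage": covered / n,
    }
```

A coverage value close to the nominal level (0.95 under the assumed interval) indicates that the variance estimator reflects the actual variability of the point estimates.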
Matching without replacement produced estimates with lower bias and better nominal coverage than matching with replacement when 1:1 matching was considered. In contrast, matching with replacement showed better balance, relative bias, and root mean squared error than matching without replacement for increasing levels of oversampling. The best nominal coverage was obtained by applying the estimator that accounts for uncertainty in the matching procedure to the sets of units obtained after matching with replacement.
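The difference between the two matching schemes can be illustrated with a minimal sketch of greedy nearest-neighbour matching on already estimated propensity scores. The function `match_nearest`, the processing of treated units in input order, and the absence of a caliper are assumptions made for illustration; the study's actual matching algorithm may differ.

```python
def match_nearest(ps_treated, ps_control, k=1, replace=False):
    """Greedy 1:k nearest-neighbour matching on the propensity score.

    Returns, for each treated unit, the list of indices of its matched
    controls. With replace=False a control can be used only once, so
    later treated units may receive fewer than k matches.
    """
    available = set(range(len(ps_control)))
    matches = []
    for p in ps_treated:
        # Rank the currently available controls by distance to this
        # treated unit; ties are broken by the control index.
        ranked = sorted(available, key=lambda j: (abs(ps_control[j] - p), j))
        chosen = ranked[:k]
        if not replace:
            # Without replacement, matched controls leave the pool.
            available.difference_update(chosen)
        matches.append(chosen)
    return matches


# Two treated units competing for the same close control (index 0).
treated = [0.50, 0.52]
controls = [0.51, 0.90, 0.10]
with_replacement = match_nearest(treated, controls, k=1, replace=True)
without_replacement = match_nearest(treated, controls, k=1, replace=False)
```

In this toy example, matching with replacement lets both treated units share the closest control, whereas matching without replacement forces the second treated unit onto a much more distant control. Setting `k` above 1 corresponds to the oversampling discussed above.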
Our results suggest that matching with replacement provides the most reliable treatment effect estimates and that no more than 1 or 2 units from the control group should be matched to each treated observation. Moreover, the variance estimator that accounts for the uncertainty in the matching procedure should be used when estimating the treatment effect.