Service de Biostatistique et Information Médicale, Hôpital Saint-Louis, UMR-S717 Inserm; Sorbonne Paris Cité, Université Paris Diderot, 1 avenue Claude Vellefaux, Paris 75010, France.
BMC Med Res Methodol. 2012 May 30;12:70. doi: 10.1186/1471-2288-12-70.
Propensity score (PS) methods are increasingly used, even when sample sizes are small or treatments are seldom used. However, the relative performance of the two most commonly recommended PS methods, PS-matching and inverse probability of treatment weighting (IPTW), has not been studied in the context of small sample sizes.
We conducted a series of Monte Carlo simulations to evaluate the influence of sample size, prevalence of treatment exposure, and strength of the association between the variables and the outcome and/or the treatment exposure, on the performance of these two methods.
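A simulation of this kind can be sketched in a few lines of Python. The sketch below is illustrative only and is not the authors' design: the data-generating model (one standard normal confounder, a logistic treatment model, a linear outcome with a homogeneous treatment effect), the coefficient values, the truncation of estimated scores to [0.01, 0.99], the caliper of 0.2 standard deviations of the logit of the PS, and all function names are assumptions made for the example. It compares the relative bias of a naive difference in means, a normalized IPTW estimator, and greedy 1:1 nearest-neighbour PS-matching.

```python
import math
import random

def propensity_scores(x, t, iters=25, ridge=1e-3):
    """Estimate P(T=1 | x) by logistic regression (Newton-Raphson, tiny ridge)."""
    b0 = b1 = 0.0
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))
    for _ in range(iters):
        g0, g1 = -ridge * b0, -ridge * b1      # gradient of penalized log-likelihood
        h00, h01, h11 = ridge, 0.0, ridge      # negative Hessian
        for xi, ti in zip(x, t):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += ti - p
            g1 += (ti - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0, b1 = b0 + (h11 * g0 - h01 * g1) / det, b1 + (h00 * g1 - h01 * g0) / det
    # Assumption: truncate extreme scores so weights and logits stay finite.
    return [min(0.99, max(0.01, sigmoid(b0 + b1 * xi))) for xi in x]

def simulate(n, rng, effect=1.0):
    """Hypothetical data-generating model with a single confounder x."""
    while True:
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        t = [int(rng.random() < 1.0 / (1.0 + math.exp(0.5 - xi))) for xi in x]
        if 0 < sum(t) < n:                     # need both treated and controls
            break
    y = [effect * ti + xi + rng.gauss(0.0, 1.0) for xi, ti in zip(x, t)]
    return x, t, y

def naive_estimate(t, y):
    """Unadjusted difference in mean outcome between treated and controls."""
    treated = [yi for ti, yi in zip(t, y) if ti]
    control = [yi for ti, yi in zip(t, y) if not ti]
    return sum(treated) / len(treated) - sum(control) / len(control)

def iptw_estimate(ps, t, y):
    """Difference of weighted means with weights 1/ps (treated), 1/(1-ps) (controls)."""
    num1 = den1 = num0 = den0 = 0.0
    for p, ti, yi in zip(ps, t, y):
        if ti:
            num1 += yi / p
            den1 += 1.0 / p
        else:
            num0 += yi / (1.0 - p)
            den0 += 1.0 / (1.0 - p)
    return num1 / den1 - num0 / den0

def matching_estimate(ps, t, y, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbour matching on logit(PS), without replacement."""
    logit = [math.log(p / (1.0 - p)) for p in ps]
    mean = sum(logit) / len(logit)
    sd = math.sqrt(sum((v - mean) ** 2 for v in logit) / (len(logit) - 1))
    controls = [i for i, ti in enumerate(t) if ti == 0]
    diffs = []
    for i in [k for k, ti in enumerate(t) if ti == 1]:
        if not controls:
            break
        j = min(controls, key=lambda c: abs(logit[c] - logit[i]))
        if abs(logit[j] - logit[i]) <= caliper_sd * sd:
            diffs.append(y[i] - y[j])
            controls.remove(j)
    return sum(diffs) / len(diffs) if diffs else None

def relative_bias(n, reps=200, effect=1.0, seed=2012):
    """Monte Carlo relative bias of each estimator at sample size n."""
    rng = random.Random(seed)
    est = {"naive": [], "iptw": [], "matching": []}
    for _ in range(reps):
        x, t, y = simulate(n, rng, effect)
        ps = propensity_scores(x, t)
        est["naive"].append(naive_estimate(t, y))
        est["iptw"].append(iptw_estimate(ps, t, y))
        m = matching_estimate(ps, t, y)
        if m is not None:                      # skip replicates with no matched pair
            est["matching"].append(m)
    return {k: (sum(v) / len(v) - effect) / effect for k, v in est.items()}

if __name__ == "__main__":
    for n in (40, 60, 200):
        print(n, relative_bias(n))
```

Because the simulated treatment effect is homogeneous, the average effect targeted by IPTW and the effect in the treated targeted by matching coincide here, which keeps the two estimators comparable; with heterogeneous effects they would estimate different quantities.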
Decreasing the sample size from 1,000 to 40 subjects did not substantially alter the Type I error rate, and led to relative biases below 10%. IPTW performed better than PS-matching down to 60 subjects. When N was set at 40, the PS-matching estimators were as biased as, or even less biased than, the IPTW estimators. Including variables unrelated to the exposure but related to the outcome in the PS model decreased the bias and the variance compared with models omitting such variables. Excluding the true confounder from the PS model resulted, whatever the method used, in a significantly biased estimate of the treatment effect. These results were illustrated in a real dataset.
Even with small study samples or a low prevalence of treatment, PS-matching and IPTW can yield correct estimates of the treatment effect, provided that the true confounders and the variables related only to the outcome are included in the PS model.