Department of Epidemiology and Biostatistics, Drexel University School of Public Health, Philadelphia, Pennsylvania, United States of America.
PLoS One. 2011 Mar 31;6(3):e18174. doi: 10.1371/journal.pone.0018174.
Propensity score weighting is sensitive to model misspecification and to outlying weights that can unduly influence results. The authors investigated whether trimming large weights downward can improve the performance of propensity score weighting and whether the benefits of trimming differ by propensity score estimation method. In a simulation study, the authors examined the performance of weight trimming following logistic regression, classification and regression trees (CART), boosted CART, and random forests to estimate propensity score weights. Results indicate that although misspecified logistic regression propensity score models yield increased bias and standard errors, weight trimming following logistic regression can improve the accuracy and precision of final parameter estimates. In contrast, weight trimming did not improve the performance of boosted CART and random forests. The performance of boosted CART and random forests without weight trimming was similar to the best performance obtainable with trimmed weights from propensity scores estimated by logistic regression. While trimming may be used to optimize propensity score weights estimated using logistic regression, the optimal level of trimming is difficult to determine. These results indicate that although trimming can improve inferences in some settings, to consistently improve the performance of propensity score weighting, analysts should focus on the procedures leading to the generation of weights (i.e., proper specification of the propensity score model) rather than relying on ad hoc methods such as weight trimming.
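To make the idea of trimming large weights downward concrete, the following is a minimal sketch rather than the authors' procedure. It assumes inverse-probability-of-treatment weights (1/e for treated units, 1/(1 - e) for controls) and a percentile cap computed with the nearest-rank rule; the abstract does not state which weight type or trimming rule was used. The function names and the toy data are hypothetical, and the propensity scores are taken as given instead of being estimated by logistic regression, CART, boosted CART, or random forests.

```python
import math


def ipt_weights(treated, scores):
    """Inverse-probability-of-treatment weights (an assumed weight type):
    1/e for treated units and 1/(1 - e) for control units."""
    weights = []
    for t, e in zip(treated, scores):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t else 1.0 / (1.0 - e))
    return weights


def trim_weights(weights, percentile):
    """Cap every weight above the given percentile at that percentile's value.
    The cap uses the nearest-rank percentile (an assumed rule)."""
    if not 0.0 < percentile <= 100.0:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(weights)
    rank = max(1, math.ceil(percentile / 100.0 * len(ordered)))
    cap = ordered[rank - 1]
    return [min(w, cap) for w in weights]


def weighted_mean_difference(outcomes, treated, weights):
    """Difference between the weighted outcome means of treated and control units."""
    sum_y = {True: 0.0, False: 0.0}
    sum_w = {True: 0.0, False: 0.0}
    for y, t, w in zip(outcomes, treated, weights):
        sum_y[bool(t)] += w * y
        sum_w[bool(t)] += w
    return sum_y[True] / sum_w[True] - sum_y[False] / sum_w[False]


# Hypothetical toy data: the second unit is treated despite a propensity
# score of 0.02, so its untrimmed weight (1 / 0.02) dominates the sample.
treated = [1, 1, 0, 0, 1, 0]
scores = [0.5, 0.02, 0.5, 0.2, 0.8, 0.75]
outcomes = [3.0, 9.0, 1.0, 2.0, 4.0, 1.5]

raw = ipt_weights(treated, scores)
trimmed = trim_weights(raw, 80)  # cap at the 80th percentile of the weights

print("largest raw weight:    ", max(raw))
print("largest trimmed weight:", max(trimmed))
print("raw estimate:    ", weighted_mean_difference(outcomes, treated, raw))
print("trimmed estimate:", weighted_mean_difference(outcomes, treated, trimmed))
```

Lowering the percentile trims more aggressively. The abstract notes that the optimal level is difficult to determine, so the value of 80 here is purely illustrative.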