Statistical criteria for selecting the optimal number of untreated subjects matched to each treated subject when using many-to-one matching on the propensity score.
Affiliation
Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada.
Publication information
Am J Epidemiol. 2010 Nov 1;172(9):1092-7. doi: 10.1093/aje/kwq224. Epub 2010 Aug 28.
Propensity-score matching is increasingly being used to estimate the effects of treatments using observational data. In many-to-one (M:1) matching on the propensity score, M untreated subjects are matched to each treated subject using the propensity score. The authors used Monte Carlo simulations to examine the effect of the choice of M on the statistical performance of matched estimators. They considered matching 1-5 untreated subjects to each treated subject using both nearest-neighbor matching and caliper matching in 96 different scenarios. Increasing the number of untreated subjects matched to each treated subject tended to increase the bias in the estimated treatment effect; conversely, increasing the number of untreated subjects matched to each treated subject decreased the sampling variability of the estimated treatment effect. Using nearest-neighbor matching, the mean squared error of the estimated treatment effect was minimized in 67.7% of the scenarios when 1:1 matching was used. Using nearest-neighbor matching or caliper matching, the mean squared error was minimized in approximately 84% of the scenarios when, at most, 2 untreated subjects were matched to each treated subject. The authors recommend that, in most settings, researchers match either 1 or 2 untreated subjects to each treated subject when using propensity-score matching.
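The bias-variance trade-off described in the abstract can be explored with a small Monte Carlo experiment. The sketch below is illustrative only and is not the authors' simulation design: the data-generating model (one confounder, a logistic treatment model with intercept -2.5, a true effect of 1.0), the use of the true propensity score rather than an estimated one, the greedy matching order, and all function names are assumptions made for this example. It covers nearest-neighbor matching without replacement only; caliper matching is not implemented.

```python
import math
import random
import statistics


def simulate_cohort(n, true_effect, rng):
    """Return a list of (propensity score, treated flag, outcome) tuples.

    Assumed model: one standard-normal confounder x drives both
    treatment assignment and the outcome.
    """
    cohort = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        ps = 1.0 / (1.0 + math.exp(-(-2.5 + x)))  # true propensity score
        treated = rng.random() < ps
        y = x + true_effect * treated + rng.gauss(0.0, 1.0)
        cohort.append((ps, treated, y))
    return cohort


def matched_estimate(cohort, m):
    """Greedy nearest-neighbor M:1 matching without replacement.

    Each treated subject takes the m closest unused untreated subjects
    (fewer if the pool is nearly empty). The estimate is the mean
    difference between each treated outcome and the mean outcome of
    its matched untreated subjects.
    """
    treated = [(ps, y) for ps, t, y in cohort if t]
    controls = [(ps, y) for ps, t, y in cohort if not t]
    diffs = []
    for ps_t, y_t in treated:
        controls.sort(key=lambda c: abs(c[0] - ps_t))
        matches, controls = controls[:m], controls[m:]
        if not matches:  # untreated pool exhausted
            break
        diffs.append(y_t - statistics.fmean(c[1] for c in matches))
    return statistics.fmean(diffs)


def performance_by_m(max_m=5, n=250, reps=200, true_effect=1.0, seed=0):
    """Bias, variance, and MSE of the matched estimator for M = 1..max_m."""
    rng = random.Random(seed)
    estimates = {m: [] for m in range(1, max_m + 1)}
    for _ in range(reps):
        cohort = simulate_cohort(n, true_effect, rng)
        for m in estimates:  # same cohort is matched at every M
            estimates[m].append(matched_estimate(cohort, m))
    results = {}
    for m, est in estimates.items():
        bias = statistics.fmean(est) - true_effect
        variance = statistics.pvariance(est)
        results[m] = {"bias": bias, "variance": variance,
                      "mse": bias ** 2 + variance}
    return results


if __name__ == "__main__":
    for m, r in performance_by_m().items():
        print(f"M={m}: bias={r['bias']:.3f} "
              f"variance={r['variance']:.4f} mse={r['mse']:.4f}")
```

Running the script prints one line per value of M, which can be compared with the pattern reported in the abstract. Results from a toy model like this depend heavily on the assumed sample size, treatment prevalence, and confounding strength, so they should not be read as a replication of the 96 scenarios.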