Too much ado about propensity score models? Comparing methods of propensity score matching.

Author information

Baser Onur

Affiliation

Thomson-Medstat, Ann Arbor, MI, USA.

Publication information

Value Health. 2006 Nov-Dec;9(6):377-85. doi: 10.1111/j.1524-4733.2006.00130.x.

Abstract

OBJECTIVE

A large number of techniques are available for conducting matching procedures, but coherent guidelines for selecting the most appropriate one do not yet exist. In this article we evaluate several matching techniques and suggest a guideline for selecting the best technique.

METHODS

The main purpose of a matching procedure is to reduce selection bias by increasing the balance between the treatment and control groups. The following approach, consisting of five quantifiable steps, is proposed to check for balance: 1) using two-sample t-statistics to compare the means of the treatment and control groups for each explanatory variable; 2) comparing the mean difference as a percentage of the average standard deviation; 3) comparing the percent reduction of bias in the means of the explanatory variables before and after matching; 4) comparing treatment and control density estimates for the explanatory variables; and 5) comparing the density estimates of the propensity scores of the control units with those of the treated units. We investigated seven different matching techniques and how they performed with regard to the proposed five steps. Moreover, we estimated the average treatment effect with multivariate analysis and compared the results with the estimates from the propensity score matching techniques. The Medstat MarketScan Data Base provided the data for empirical examples of the utility of several matching methods. We conducted nearest neighbor matching (NNM) analyses in seven ways: with replacement, 2 to 1 matching, Mahalanobis matching (MM), MM with caliper, kernel matching, radius matching, and the stratification method.
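As an illustration of the simplest of the matching variants listed above, the following is a minimal sketch of 1-to-1 nearest neighbor matching on an already estimated propensity score, with an optional caliper and a switch for matching with or without replacement. It is not the article's implementation: the function name `nearest_neighbor_match`, its parameters, and the greedy processing order are assumptions made for this example, and the abstract does not state the caliper width used.

```python
def nearest_neighbor_match(treated_ps, control_ps, caliper=None, replace=True):
    """Greedy 1-to-1 nearest neighbor matching on propensity scores.

    Hypothetical helper for illustration only.
    treated_ps, control_ps: sequences of estimated propensity scores.
    caliper: maximum allowed score distance (None means no limit).
    replace: if True, a control unit may be matched more than once.
    Returns a list of (treated_index, control_index) pairs; treated units
    with no control inside the caliper are left unmatched.
    """
    available = set(range(len(control_ps)))
    pairs = []
    for i, score in enumerate(treated_ps):
        if not available:
            break  # no controls left when matching without replacement
        # Closest available control; ties are broken by the lower index.
        j = min(available, key=lambda k: (abs(control_ps[k] - score), k))
        if caliper is not None and abs(control_ps[j] - score) > caliper:
            continue  # nearest control is too far away: drop this treated unit
        pairs.append((i, j))
        if not replace:
            available.remove(j)
    return pairs


if __name__ == "__main__":
    treated = [0.30, 0.62, 0.90]          # made-up scores, not study data
    control = [0.28, 0.35, 0.60, 0.10]
    print(nearest_neighbor_match(treated, control, caliper=0.05))
```

Because the loop is greedy, matching without replacement depends on the order in which treated units are processed; this is one reason estimates can differ across matching variants.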

RESULTS

Comparing techniques according to the above criteria revealed that the choice of matching technique has significant effects on outcomes. Patients with asthma were compared with patients without asthma, and the estimated cost of illness ranged from 2040 dollars to 4463 dollars depending on the type of matching. After matching, we looked for insignificant differences or larger P-values in the mean values (criterion 1); low mean differences as a percentage of the average standard deviation (criterion 2); 100% reduction of bias in the means of the explanatory variables (criterion 3); and insignificant differences when comparing the density estimates of the treatment and control groups (criteria 4 and 5). Mahalanobis matching with caliper yielded the best results according to all five criteria (Mean = 4463 dollars, SD = 3252 dollars). We also applied multivariate analysis over the matched sample. This decreased the deviation in cost of illness estimates more than threefold (Mean = 4456 dollars, SD = 996 dollars).
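Criteria 1 to 3 can be computed for a single explanatory variable as in the minimal sketch below. The abstract does not give the exact formulas, so the choices here are assumptions: Welch's unequal-variance form of the two-sample t-statistic, the square root of the average of the two group variances as the "average standard deviation", and bias reduction measured on the absolute difference in means. All function names are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(x_treated, x_control):
    """Criterion 1: two-sample t-statistic (Welch form, an assumption)."""
    se = math.sqrt(variance(x_treated) / len(x_treated)
                   + variance(x_control) / len(x_control))
    return (mean(x_treated) - mean(x_control)) / se


def standardized_difference(x_treated, x_control):
    """Criterion 2: mean difference as a percentage of the average SD.

    Assumes the average SD is sqrt((var_treated + var_control) / 2).
    """
    avg_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2)
    return 100 * (mean(x_treated) - mean(x_control)) / avg_sd


def percent_bias_reduction(treated_before, control_before,
                           treated_after, control_after):
    """Criterion 3: percent reduction in the absolute mean difference."""
    before = abs(mean(treated_before) - mean(control_before))
    after = abs(mean(treated_after) - mean(control_after))
    return 100 * (1 - after / before)


if __name__ == "__main__":
    # Made-up ages for illustration; not data from the study.
    treated = [50.0, 60.0, 70.0]
    control_all = [30.0, 40.0, 50.0]      # before matching
    control_matched = [55.0, 60.0, 62.0]  # after matching
    print(welch_t(treated, control_all))
    print(standardized_difference(treated, control_all))
    print(percent_bias_reduction(treated, control_all, treated, control_matched))
```

Under these definitions, a value near 100 from `percent_bias_reduction` corresponds to the "100% reduction of bias" target in criterion 3, and a standardized difference near 0 corresponds to the "low mean differences" target in criterion 2.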

CONCLUSION

Sensitivity analysis of the matching techniques is especially important because none of the proposed methods in the literature is a priori superior to the others. The suggested joint consideration of propensity score matching and multivariate analysis offers an approach to assessing the robustness of the estimates.
