Fullerton Birgit, Pöhlmann Boris, Krohn Robert, Adams John L, Gerlach Ferdinand M, Erler Antje
Institute of General Practice, Johann Wolfgang Goethe-University, Frankfurt, Germany.
AQUA Institute (Institute for Applied Quality Improvement and Research in Health Care), Göttingen, Germany.
Health Serv Res. 2016 Oct;51(5):1960-80. doi: 10.1111/1475-6773.12452. Epub 2016 Feb 3.
To present a case study on how to compare various matching methods using different measures of balance, and to point out some pitfalls of relying on such measures.
Administrative claims data from a German statutory health insurance fund covering the years 2004-2008.
We applied three different covariate balance diagnostics to a selection of 12 matching methods used to evaluate the effectiveness of the German disease management program for type 2 diabetes (DMPDM2). We further compared the effect estimates obtained when these matching techniques were applied in the evaluation of the DMPDM2.
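As an illustration of what a balance diagnostic computes, the following is a minimal sketch in Python. The abstract does not name the three diagnostics that were applied, so the standardized mean difference, variance ratio, and Kolmogorov-Smirnov statistic below are assumptions chosen because they are commonly used; the function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Absolute difference in means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        return 0.0
    return abs(mean(treated) - mean(control)) / pooled_sd


def variance_ratio(treated, control):
    """Ratio of sample variances; values far from 1 signal unequal spread."""
    return variance(treated) / variance(control)


def ks_statistic(treated, control):
    """Largest vertical gap between the two empirical distribution functions."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)

    points = sorted(set(treated) | set(control))
    return max(abs(ecdf(treated, x) - ecdf(control, x)) for x in points)


# Hypothetical ages: both groups have the same mean but very different spread.
treated_age = [50, 60, 70, 80, 90]
control_age = [66, 68, 70, 72, 74]

print("SMD:", standardized_mean_difference(treated_age, control_age))
print("Variance ratio:", variance_ratio(treated_age, control_age))
print("KS statistic:", ks_statistic(treated_age, control_age))
```

In this toy example the means coincide, so the standardized mean difference is zero, while the variance ratio and the Kolmogorov-Smirnov statistic both flag the difference in spread. This is one way a mean-based measure and a measure sensitive to higher moments can rank the same matched sample differently.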
The assessed performance of the matching methods depended on the choice of balance measure. Exact matching methods performed well across all measures of balance but excluded many observations, which changed both the baseline characteristics of the study sample and the effect estimate of the DMPDM2. All propensity score (PS)-based methods showed similar effect estimates. Applying a higher matching ratio and using a larger variable set generally resulted in better balance. Using a generalized boosted model instead of a logistic regression model performed slightly better on balance diagnostics that take imbalances at higher moments into account.
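To make the roles of the matching ratio and of excluded observations concrete, here is a minimal sketch of greedy 1:k nearest-neighbor matching on a precomputed score. It is not the procedure used in the study: the function name, the caliper parameter, and the example scores are hypothetical, and the scores are assumed to have been estimated beforehand (for example, with a logistic regression or a boosted model).

```python
def greedy_ratio_match(treated_scores, control_scores, ratio=1, caliper=None):
    """Greedy 1:ratio nearest-neighbor matching without replacement.

    Returns a dict mapping each matched treated index to a list of control
    indices. Treated units that cannot be given `ratio` controls within the
    caliper are left out of the result.
    """
    available = set(range(len(control_scores)))
    matches = {}
    for t_idx, t_score in enumerate(treated_scores):
        # Rank the remaining controls by distance to this treated unit.
        candidates = sorted(
            available, key=lambda c: (abs(control_scores[c] - t_score), c)
        )
        if caliper is not None:
            candidates = [
                c for c in candidates
                if abs(control_scores[c] - t_score) <= caliper
            ]
        if len(candidates) < ratio:
            continue  # too few comparable controls: unit is excluded
        chosen = candidates[:ratio]
        matches[t_idx] = chosen
        available -= set(chosen)
    return matches


# Hypothetical scores: the second treated unit has no nearby controls.
treated = [0.30, 0.90]
controls = [0.28, 0.35, 0.32, 0.10]
print(greedy_ratio_match(treated, controls, ratio=2, caliper=0.1))
```

In this example the treated unit with score 0.90 is dropped because no control lies within the caliper. This mirrors, in a simplified way, how a matching procedure can exclude observations and thereby change the composition of the analyzed sample.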
Best practice should include the application of several matching methods and thorough balance diagnostics. Matching can serve as a useful preprocessing step that reveals areas of the data lacking common support. Using different balance diagnostics can help in interpreting the different effect estimates obtained with different matching methods.