Arpino Bruno, Cannas Massimo
Department of Political and Social Sciences, Universitat Pompeu Fabra, Barcelona, Spain.
Department of Economic and Business Science, University of Cagliari, Via Sant'Ignazio 17, Cagliari, 09124, Italy.
Stat Med. 2016 May 30;35(12):2074-91. doi: 10.1002/sim.6880. Epub 2016 Feb 1.
This article focuses on the implementation of propensity score matching for clustered data. Different approaches to reduce bias due to cluster-level confounders are considered and compared using Monte Carlo simulations. We investigated methods that exploit the clustered structure of the data in two ways: in the estimation of the propensity score model (through the inclusion of fixed or random effects) or in the implementation of the matching algorithm. In addition to pure within-cluster matching, we also assessed the performance of a new approach, 'preferential' within-cluster matching. This approach first searches for control units to be matched to treated units within the same cluster. If matching is not possible within-cluster, then the algorithm searches in other clusters. All considered approaches successfully reduced the bias due to the omission of a cluster-level confounder. The preferential within-cluster matching approach, combining the advantages of within-cluster and between-cluster matching, performed relatively well with both big and small clusters, and it was often the best method. An important advantage of this approach is that it reduces the number of unmatched units compared with pure within-cluster matching. We applied these methods to the estimation of the effect of caesarean section on the Apgar score using birth register data. Copyright © 2016 John Wiley & Sons, Ltd.
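The 'preferential' within-cluster search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the matching details, so the sketch assumes propensity scores have already been estimated, and it assumes greedy 1:1 nearest-neighbour matching without replacement, a fixed caliper on the score difference, and a two-pass order (all within-cluster matches first, then a search in other clusters for treated units still unmatched). The function name, its parameters, and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set


def preferential_within_cluster_match(
    scores: Sequence[float],
    treated: Sequence[bool],
    cluster: Sequence[int],
    caliper: float = 0.05,
    allow_between: bool = True,
) -> Dict[int, Optional[int]]:
    """Map each treated unit index to a matched control index (or None).

    Assumptions (not taken from the paper): greedy 1:1 matching without
    replacement, absolute score distance, fixed caliper, two passes.
    Setting allow_between=False gives pure within-cluster matching.
    """
    available: Set[int] = {i for i, t in enumerate(treated) if not t}
    treated_units: List[int] = [i for i, t in enumerate(treated) if t]
    matches: Dict[int, Optional[int]] = {}

    def nearest(i: int, pool: Set[int]) -> Optional[int]:
        # Closest control within the caliper; ties go to the lowest index.
        candidates = [(abs(scores[i] - scores[j]), j) for j in pool]
        candidates = [c for c in candidates if c[0] <= caliper]
        return min(candidates)[1] if candidates else None

    # Pass 1: search only among controls in the treated unit's own cluster.
    for i in treated_units:
        same_cluster = {j for j in available if cluster[j] == cluster[i]}
        j = nearest(i, same_cluster)
        matches[i] = j
        if j is not None:
            available.discard(j)

    # Pass 2: treated units still unmatched search the remaining controls,
    # which at this point can only yield matches from other clusters.
    if allow_between:
        for i in treated_units:
            if matches[i] is None:
                j = nearest(i, available)
                matches[i] = j
                if j is not None:
                    available.discard(j)

    return matches


# Toy data: seven units in two clusters (values are illustrative only).
scores = [0.30, 0.32, 0.70, 0.72, 0.50, 0.71, 0.69]
treated = [True, False, True, False, False, True, False]
cluster = [0, 0, 0, 1, 1, 1, 1]

preferential = preferential_within_cluster_match(scores, treated, cluster)
pure_within = preferential_within_cluster_match(
    scores, treated, cluster, allow_between=False
)
print(preferential)
print(pure_within)
```

In this toy example, treated unit 2 has no control left in its own cluster, so pure within-cluster matching leaves it unmatched, while the preferential variant finds a control in the other cluster. This mirrors the advantage the abstract reports, a smaller number of unmatched units, under the assumptions stated above.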