Integrative Tumor Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, USA.
Cancer Prevention Fellowship Program, Division of Cancer Prevention, National Cancer Institute, Rockville, Maryland, USA.
Stat Med. 2020 Jul 30;39(17):2308-2323. doi: 10.1002/sim.8540. Epub 2020 Apr 16.
Methods for multiple-treatment propensity scoring in the high-dimensional covariate spaces that arise from "big data" are currently lacking; the most prominent method relies on inverse probability treatment weighting (IPTW). However, IPTW uses only one element of the generalized propensity score (GPS) vector, which can lead to a loss of information and inadequate covariate balance in the presence of multiple treatments. This limitation motivates the development of a novel propensity score method that uses the entire GPS vector to establish a scalar balancing score that, when adjusted for, achieves covariate balance in the presence of potentially high-dimensional covariates. Specifically, the generalized propensity score cumulative distribution function (GPS-CDF) method is introduced. A one-parameter power function is fit to the CDF of the GPS vector, and the resulting scalar balancing score is used for matching and/or stratification. Simulation results show superior performance of the new method compared to IPTW, both in achieving covariate balance and in estimating average treatment effects in the presence of multiple treatments. The proposed approach is applied to a study derived from electronic medical records to determine the causal relationship between three different vasopressors and mortality in patients with non-traumatic aneurysmal subarachnoid hemorrhage. Results suggest that the GPS-CDF method performs well when applied to large observational studies with multiple treatments and large covariate spaces.
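To make the contrast between the two approaches concrete, the following Python sketch reduces a GPS vector to a single scalar by fitting a one-parameter power function to its cumulative sums, and computes IPTW weights next to it for comparison. The abstract does not give the functional form or the fitting procedure, so everything specific here is an assumption: the power function F(x) = x^a evaluated at the points k/J, the least-squares grid search, the fixed ordering of treatments, and the names `gps_cdf_score` and `iptw_weights`. It is an illustration of the idea, not the authors' implementation.

```python
import numpy as np


def gps_cdf_score(gps, grid_size=4001):
    """Scalar score per subject from a GPS vector (illustrative sketch).

    Assumption: the cumulative sums of the J treatment probabilities,
    placed at x_k = k / J, are approximated by F(x) = x ** a, and the
    fitted power a is taken as the scalar balancing score.
    """
    gps = np.asarray(gps, dtype=float)
    if gps.ndim == 1:
        gps = gps[None, :]
    n_subjects, n_treatments = gps.shape
    if n_treatments < 2:
        raise ValueError("need at least two treatments")
    if not np.allclose(gps.sum(axis=1), 1.0):
        raise ValueError("each GPS vector must sum to 1")

    # Interior points only: at x = 1 both the CDF and x ** a equal 1.
    x = np.arange(1, n_treatments) / n_treatments
    cum = np.cumsum(gps, axis=1)[:, :-1]              # (n, J-1)

    # Least-squares fit of a by grid search on a log scale (assumed method).
    a_grid = np.exp(np.linspace(np.log(1e-3), np.log(1e3), grid_size))
    fitted = x[None, :] ** a_grid[:, None]            # (G, J-1)
    sse = ((fitted[None, :, :] - cum[:, None, :]) ** 2).sum(axis=2)
    return a_grid[np.argmin(sse, axis=1)]


def iptw_weights(gps, treatment):
    """IPTW weight: inverse of the GPS element for the received treatment."""
    gps = np.asarray(gps, dtype=float)
    treatment = np.asarray(treatment, dtype=int)
    return 1.0 / gps[np.arange(len(treatment)), treatment]


if __name__ == "__main__":
    # Hypothetical GPS vectors for three subjects and three treatments.
    gps = np.array([
        [1 / 3, 1 / 3, 1 / 3],
        [0.8, 0.1, 0.1],
        [0.1, 0.1, 0.8],
    ])
    received = np.array([0, 0, 2])
    print("power-parameter scores:", gps_cdf_score(gps))
    print("IPTW weights:", iptw_weights(gps, received))
```

In this sketch the IPTW weight depends only on the probability of the treatment actually received, whereas the fitted power parameter changes whenever any element of the GPS vector changes. Subjects with similar scores could then be matched or grouped into strata. Note that under these assumptions the score depends on the order in which the treatments are listed.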