Center for Real-World Effectiveness and Safety of Therapeutics and the Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Pharmacoepidemiol Drug Saf. 2024 Jan;33(1):e5678. doi: 10.1002/pds.5678. Epub 2023 Aug 23.
High-dimensional propensity score (hdPS) is a semiautomated method that leverages a vast number of covariates available in healthcare databases to improve confounding adjustment. A novel combined Super Learner (SL)-hdPS approach was proposed to assist with selecting the number of covariates for propensity score inclusion, and was found in plasmode simulation studies to improve bias reduction and precision compared to hdPS alone. However, the approach has not been examined in the applied setting.
We compared SL-hdPS's performance with that of several hdPS models, each with prespecified covariates and a different number of empirically identified covariates, using a cohort study comparing real-world bleeding rates between ibrutinib- and bendamustine-rituximab (BR)-treated individuals with chronic lymphocytic leukemia in Optum's de-identified Clinformatics® Data Mart commercial claims database (2013-2020). We used inverse probability of treatment weighting for confounding adjustment and Cox proportional hazards regression to estimate hazard ratios (HRs) for bleeding outcomes. Parameters of interest included balance of prespecified and empirically identified covariates (absolute standardized difference [ASD] thresholds of <0.10 and <0.05) and outcome HR precision (95% confidence intervals).
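The balance metric described above can be illustrated with a minimal sketch. It is not the study's code: the function names (`iptw_weights`, `weighted_asd`, `count_balanced`) are hypothetical, the weights are assumed to be unstabilized average-treatment-effect weights, and the ASD is assumed to use the square root of the mean of the two weighted group variances as its denominator. The abstract does not state which conventions the authors used.

```python
import numpy as np


def iptw_weights(treated, ps):
    """Unstabilized IPTW weights (assumed form): 1/ps for treated, 1/(1-ps) for comparators."""
    treated = np.asarray(treated, dtype=bool)
    ps = np.asarray(ps, dtype=float)
    return np.where(treated, 1.0 / ps, 1.0 / (1.0 - ps))


def weighted_asd(x, treated, weights):
    """Absolute standardized difference of covariate x between weighted treatment groups."""
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    w = np.asarray(weights, dtype=float)

    def mean_var(mask):
        # Weighted mean and weighted (population-style) variance within one group.
        m = np.average(x[mask], weights=w[mask])
        v = np.average((x[mask] - m) ** 2, weights=w[mask])
        return m, v

    m1, v1 = mean_var(treated)
    m0, v0 = mean_var(~treated)
    pooled_sd = np.sqrt((v1 + v0) / 2.0)
    if pooled_sd == 0.0:
        # No variation in either group: balanced if means agree, otherwise undefined (infinite).
        return 0.0 if m1 == m0 else float("inf")
    return abs(m1 - m0) / pooled_sd


def count_balanced(asds, threshold):
    """Number of covariates whose ASD falls strictly below the threshold (e.g. 0.10 or 0.05)."""
    return int(np.sum(np.asarray(asds, dtype=float) < threshold))
```

Under these assumptions, one would compute `weighted_asd` for each prespecified and empirically identified covariate under a given propensity score model, then compare `count_balanced(asds, 0.10)` and `count_balanced(asds, 0.05)` across models.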
We identified 2423 ibrutinib- and 1102 BR-treated individuals. Including >200 empirically identified covariates in the hdPS model compromised covariate balance at both ASD thresholds. SL-hdPS balanced more covariates than all individual hdPS models at both ASD thresholds. The bleeding HR 95% confidence intervals were generally narrower with SL-hdPS than with individual hdPS models.
In a real-world application, hdPS was sensitive to the number of covariates included, while use of SL for covariate selection resulted in improved covariate balance and possibly improved precision.