Kreif Noémi, Gruber Susan, Radice Rosalba, Grieve Richard, Sekhon Jasjeet S
Department of Health Services Research and Policy, London School of Hygiene and Tropical Medicine, London, UK
Department of Epidemiology, Harvard School of Public Health, Boston, MA, USA.
Stat Methods Med Res. 2016 Oct;25(5):2315-2336. doi: 10.1177/0962280214521341. Epub 2014 Feb 12.
Statistical approaches for estimating treatment effectiveness commonly model the endpoint, or the propensity score, using parametric regressions such as generalised linear models. Misspecification of these models can lead to biased parameter estimates. We compare two approaches that combine the propensity score and the endpoint regression, and that can make weaker modelling assumptions by using machine learning to estimate the regression function and the propensity score. Targeted maximum likelihood estimation is a double-robust method designed to reduce bias in the estimate of the parameter of interest. Bias-corrected matching uses regression predictions to reduce the bias due to covariate imbalance between matched pairs. We illustrate the methods in an evaluation of the effect of different types of hip prosthesis on the health-related quality of life of patients with osteoarthritis. We undertake a simulation study, grounded in the case study, to compare the relative bias, efficiency and confidence interval coverage of the methods. We consider data generating processes with non-linear functional form relationships, and with normal and non-normal endpoints. We find that, across the circumstances considered, bias-corrected matching generally showed less bias but higher variance than targeted maximum likelihood estimation. When either targeted maximum likelihood estimation or bias-corrected matching incorporated machine learning, bias was much reduced compared with using misspecified parametric models.
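To make the idea of bias-corrected matching concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single covariate, one-to-one nearest-neighbour matching with replacement, and a simple linear regression fitted to the controls as a stand-in for the machine learning regression the abstract describes. The function names (`fit_line`, `bias_corrected_att`) and the toy data are hypothetical. The estimand here is the average treatment effect on the treated.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def bias_corrected_att(x, t, y):
    """Return (raw, corrected) matching estimates of the effect on the treated.

    x: covariate values, t: treatment indicators (0/1), y: endpoints.
    """
    controls = [(xi, yi) for xi, ti, yi in zip(x, t, y) if ti == 0]
    treated = [(xi, yi) for xi, ti, yi in zip(x, t, y) if ti == 1]

    # Endpoint regression among controls (assumed linear for illustration).
    a, b = fit_line([c[0] for c in controls], [c[1] for c in controls])

    raw, corrected = [], []
    for xi, yi in treated:
        # Nearest control on the covariate, matching with replacement.
        xj, yj = min(controls, key=lambda c: abs(c[0] - xi))
        raw.append(yi - yj)
        # Subtract the predicted endpoint difference caused by the
        # remaining covariate gap between the matched pair.
        corrected.append(yi - yj - ((a + b * xi) - (a + b * xj)))
    return mean(raw), mean(corrected)


# Toy data: control endpoint is 2*x, and treatment adds exactly 1.
x = [0, 1, 2, 3, 0.4, 1.4, 2.4]
t = [0, 0, 0, 0, 1, 1, 1]
y = [0, 2, 4, 6, 1.8, 3.8, 5.8]
raw_att, corrected_att = bias_corrected_att(x, t, y)
print(raw_att, corrected_att)
```

In this toy example each treated unit sits 0.4 above its matched control on the covariate, so the raw matched difference overstates the effect. The regression adjustment removes that gap because the linear model happens to be correctly specified here. With a misspecified regression the correction would be only partial, which is the motivation for the flexible learners the abstract describes.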