Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02120, United States.
Division of Biostatistics, School of Public Health, University of California, Berkeley, Berkeley, CA 94720, United States.
Am J Epidemiol. 2024 Nov 4;193(11):1632-1640. doi: 10.1093/aje/kwae023.
Least absolute shrinkage and selection operator (LASSO) regression is widely used for large-scale propensity score (PS) estimation in health-care database studies. In these settings, previous work has shown that undersmoothing (overfitting) LASSO PS models can improve confounding control, but it can also cause problems of nonoverlap in covariate distributions. It remains unclear how to select the degree of undersmoothing when fitting large-scale LASSO PS models to improve confounding control while avoiding issues that can result from reduced covariate overlap. Here, we used simulations to evaluate the performance of using collaborative-controlled targeted learning to data-adaptively select the degree of undersmoothing when fitting large-scale PS models within both singly and doubly robust frameworks to reduce bias in causal estimators. Simulations showed that collaborative learning can data-adaptively select the degree of undersmoothing to reduce bias in estimated treatment effects. Results further showed that when fitting undersmoothed LASSO PS models, the use of cross-fitting was important for avoiding nonoverlap in covariate distributions and reducing bias in causal estimates.
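As a rough illustration of two ideas in the abstract, the sketch below fits an L1-penalized (LASSO) logistic PS model at progressively smaller penalties, which is one way to undersmooth, and scores every unit with a model fitted without that unit's fold (cross-fitting). It is a minimal pure-Python sketch under stated assumptions, not the authors' implementation. The toy data, the reference penalty `lam_ref`, the shrink factors, and all function names are hypothetical. The collaborative-controlled targeted learning step that the study uses to choose the penalty is not implemented; the loop only prints an overlap diagnostic and a normalized IPTW estimate for each candidate penalty.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def soft_threshold(b, t):
    """Proximal operator of the L1 penalty: shrink b toward zero by t."""
    return math.copysign(max(abs(b) - t, 0.0), b)

def fit_lasso_logistic(X, a, lam, step=0.5, n_iter=150):
    """L1-penalized logistic regression by proximal gradient descent.
    The intercept is unpenalized; a smaller lam means less shrinkage.
    Runs a fixed number of iterations (no convergence check)."""
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [sigmoid(b0 + sum(b * x for b, x in zip(beta, row))) - ai
                 for row, ai in zip(X, a)]
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b0 -= step * sum(resid) / n
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return b0, beta

def predict_ps(model, row):
    """Predicted probability of treatment for one covariate row."""
    b0, beta = model
    return sigmoid(b0 + sum(b * x for b, x in zip(beta, row)))

def cross_fit_ps(X, a, lam, k=4, seed=0):
    """Out-of-fold propensity scores: each unit is scored by a model
    fitted on the other k-1 folds only."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    ps = [None] * len(X)
    for f in range(k):
        held_out = set(idx[f::k])
        train = [i for i in idx if i not in held_out]
        model = fit_lasso_logistic([X[i] for i in train], [a[i] for i in train], lam)
        for i in held_out:
            ps[i] = predict_ps(model, X[i])
    return ps

def hajek_iptw(a, y, ps):
    """Normalized inverse-probability-weighted difference in mean outcomes."""
    w1 = [ai / p for ai, p in zip(a, ps)]
    w0 = [(1 - ai) / (1 - p) for ai, p in zip(a, ps)]
    mu1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mu0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mu1 - mu0

# Hypothetical toy data: 4 Gaussian covariates, 2 of which confound treatment and outcome.
rng = random.Random(1)
n = 200
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
a = [int(rng.random() < sigmoid(0.8 * r[0] - 0.8 * r[1])) for r in X]
y = [1.0 * ai + r[0] - r[1] + rng.gauss(0, 1) for ai, r in zip(a, X)]

lam_ref = 0.05  # assumed stand-in for a cross-validated penalty (not tuned here)
for factor in (1.0, 0.1, 0.01):  # smaller penalty = more undersmoothing
    ps = cross_fit_ps(X, a, lam_ref * factor)
    print(f"lambda={lam_ref * factor:.4f}  PS range=({min(ps):.3f}, {max(ps):.3f})  "
          f"IPTW estimate={hajek_iptw(a, y, ps):.3f}")
```

Printing the range of the cross-fitted propensity scores next to each estimate gives a quick check on overlap: scores approaching 0 or 1 as the penalty shrinks would signal the nonoverlap problem the abstract describes. In a real analysis the penalty grid would typically start from a cross-validated value, and a data-adaptive selector would replace the fixed factors used here.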