
A comparison of machine learning algorithms and covariate balance measures for propensity score matching and weighting.

Authors

Cannas Massimo, Arpino Bruno

Affiliations

Department of Economic and Business Sciences, University of Cagliari, Cagliari, Italy.

Department of Statistics, Computer Science, Applications, University of Firenze, Firenze, Italy.

Publication information

Biom J. 2019 Jul;61(4):1049-1072. doi: 10.1002/bimj.201800132. Epub 2019 May 14.

DOI:10.1002/bimj.201800132
PMID:31090108
Abstract

Propensity score matching (PSM) and propensity score weighting (PSW) are popular tools to estimate causal effects in observational studies. We address two open issues: how to estimate propensity scores and how to assess covariate balance. Using simulations, we compare the performance of PSM and PSW based on logistic regression and machine learning algorithms (CART; Bagging; Boosting; Random Forest; Neural Networks; naive Bayes). Additionally, we consider several measures of covariate balance (Absolute Standardized Average Mean (ASAM) with and without interactions; measures based on the quantile-quantile plots; ratio between variances of propensity scores; area under the curve (AUC)) and assess their ability to predict the bias of PSM and PSW estimators. We also investigate the importance of tuning machine learning parameters in the context of propensity score methods. Two simulation designs are employed. In the first, the generating processes are inspired by birth register data used to assess the effect of labor induction on the occurrence of caesarean section. The second exploits more general generating mechanisms. Overall, among the different techniques, random forests performed the best, especially in PSW. Logistic regression and neural networks also showed an excellent performance similar to that of random forests. As for covariate balance, the simplest and most commonly used metric, the ASAM, showed a strong correlation with the bias of causal effect estimators. Our findings suggest that researchers should aim at obtaining an ASAM lower than 10% for as many variables as possible. In the empirical study, we found that labor induction had a small and not statistically significant impact on caesarean section.
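
The balance check described in the abstract can be illustrated with a small sketch. It is not the authors' code: the function names (`att_weights`, `standardized_differences`, `asam`), the toy data, and the choice to standardize by the unweighted standard deviation in the treated group are all assumptions made for illustration. The propensity scores are taken as given, so they could come from logistic regression, a random forest, or any other model. The weights target the effect on the treated: treated units get weight 1 and controls get ps / (1 - ps).

```python
from statistics import mean, stdev


def att_weights(propensity, treated):
    """Weights for the effect on the treated: 1 for treated, ps/(1-ps) for controls."""
    return [1.0 if t == 1 else p / (1.0 - p) for p, t in zip(propensity, treated)]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def standardized_differences(covariates, treated, weights=None):
    """Absolute standardized mean difference (in percent) for each covariate.

    covariates: dict mapping covariate name -> list of numeric values
    treated:    list of 0/1 treatment indicators
    weights:    optional unit weights (defaults to 1 for every unit)
    """
    if weights is None:
        weights = [1.0] * len(treated)
    result = {}
    for name, values in covariates.items():
        t_vals = [v for v, t in zip(values, treated) if t == 1]
        t_wts = [w for w, t in zip(weights, treated) if t == 1]
        c_vals = [v for v, t in zip(values, treated) if t == 0]
        c_wts = [w for w, t in zip(weights, treated) if t == 0]
        # Assumed convention: scale by the unweighted SD in the treated group.
        scale = stdev(t_vals)
        diff = weighted_mean(t_vals, t_wts) - weighted_mean(c_vals, c_wts)
        result[name] = 100.0 * abs(diff) / scale
    return result


def asam(covariates, treated, weights=None):
    """Average of the absolute standardized differences across covariates."""
    return mean(standardized_differences(covariates, treated, weights).values())


if __name__ == "__main__":
    # Toy data (hypothetical): three treated units followed by three controls.
    treated = [1, 1, 1, 0, 0, 0]
    covariates = {"age": [30.0, 32.0, 34.0, 26.0, 30.0, 34.0]}
    propensity = [0.6, 0.7, 0.8, 0.2, 0.5, 0.75]  # assumed model output

    weights = att_weights(propensity, treated)
    print("ASAM before weighting:", asam(covariates, treated))
    print("ASAM after weighting: ", asam(covariates, treated, weights))
    # Flag covariates that miss the 10% threshold mentioned in the abstract.
    for name, d in standardized_differences(covariates, treated, weights).items():
        print(name, "balanced" if d < 10.0 else "imbalanced", round(d, 1))
```

Under this sketch, a covariate counts as balanced when its standardized difference falls below the 10% threshold the abstract recommends. Other conventions (for example a pooled standard deviation) would give different numbers.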

Similar articles

1. A comparison of machine learning algorithms and covariate balance measures for propensity score matching and weighting.
   Biom J. 2019 Jul;61(4):1049-1072. doi: 10.1002/bimj.201800132. Epub 2019 May 14.
2. The Balance Super Learner: A robust adaptation of the Super Learner to improve estimation of the average treatment effect in the treated based on propensity score matching.
   Stat Methods Med Res. 2018 Aug;27(8):2504-2518. doi: 10.1177/0962280216682055. Epub 2016 Dec 15.
3. Using machine learning to assess covariate balance in matching studies.
   J Eval Clin Pract. 2016 Dec;22(6):844-850. doi: 10.1111/jep.12538. Epub 2016 Mar 23.
4. Applied comparison of large-scale propensity score matching and cardinality matching for causal inference in observational research.
   BMC Med Res Methodol. 2021 May 24;21(1):109. doi: 10.1186/s12874-021-01282-1.
5. Prognostic score-based model averaging approach for propensity score estimation.
   BMC Med Res Methodol. 2024 Oct 3;24(1):228. doi: 10.1186/s12874-024-02350-y.
6. Estimating causal effects for survival (time-to-event) outcomes by combining classification tree analysis and propensity score weighting.
   J Eval Clin Pract. 2018 Apr;24(2):380-387. doi: 10.1111/jep.12859. Epub 2017 Dec 12.
7. Using classification tree analysis to generate propensity score weights.
   J Eval Clin Pract. 2017 Aug;23(4):703-712. doi: 10.1111/jep.12744. Epub 2017 Mar 28.
8. Combining machine learning and matching techniques to improve causal inference in program evaluation.
   J Eval Clin Pract. 2016 Dec;22(6):864-870. doi: 10.1111/jep.12592. Epub 2016 Jun 29.
9. Propensity score matching with clustered data. An application to the estimation of the impact of caesarean section on the Apgar score.
   Stat Med. 2016 May 30;35(12):2074-91. doi: 10.1002/sim.6880. Epub 2016 Feb 1.
10. The effect of labour induction on the risk of caesarean delivery: using propensity scores to control confounding by indication.
    BJOG. 2016 Aug;123(9):1521-9. doi: 10.1111/1471-0528.13682. Epub 2015 Sep 28.

Cited by

1. Use of Machine Learning to Compare Disease Risk Scores and Propensity Scores Across Complex Confounding Scenarios: A Simulation Study.
   Pharmacoepidemiol Drug Saf. 2025 Jun;34(6):e70165. doi: 10.1002/pds.70165.
2. Impact of missing electronic fetal monitoring signals on perinatal asphyxia: a multicohort analysis.
   NPJ Digit Med. 2025 May 1;8(1):233. doi: 10.1038/s41746-025-01665-4.
3. Guidelines and Best Practices for the Use of Targeted Maximum Likelihood and Machine Learning When Estimating Causal Effects of Exposures on Time-To-Event Outcomes.
   Stat Med. 2025 Mar 15;44(6):e70034. doi: 10.1002/sim.70034.
4. Robust propensity score estimation via loss function calibration.
   Stat Methods Med Res. 2025 Mar;34(3):457-472. doi: 10.1177/09622802241308709. Epub 2025 Feb 12.
5. Predicting vaginal delivery after labor induction using machine learning: Development of a multivariable prediction model.
   Acta Obstet Gynecol Scand. 2025 Jan;104(1):164-173. doi: 10.1111/aogs.14953. Epub 2024 Nov 27.
6. Machine Learning Algorithms to Estimate Propensity Scores in Health Policy Evaluation: A Scoping Review.
   Int J Environ Res Public Health. 2024 Nov 7;21(11):1484. doi: 10.3390/ijerph21111484.
7. Machine learning methods for propensity and disease risk score estimation in high-dimensional data: a plasmode simulation and real-world data cohort analysis.
   Front Pharmacol. 2024 Oct 28;15:1395707. doi: 10.3389/fphar.2024.1395707. eCollection 2024.
8. Integrating ensemble and machine learning models for early prediction of pneumonia mortality using laboratory tests.
   Heliyon. 2024 Jul 14;10(14):e34525. doi: 10.1016/j.heliyon.2024.e34525. eCollection 2024 Jul 30.
9. An improved multiply robust estimator for the average treatment effect.
   BMC Med Res Methodol. 2023 Oct 11;23(1):231. doi: 10.1186/s12874-023-02056-7.
10. Directed acyclic graphs for clinical research: a tutorial.
    J Minim Invasive Surg. 2023 Sep 15;26(3):97-107. doi: 10.7602/jmis.2023.26.3.97.