A comparison of random forest-based missing imputation methods for covariates in propensity score analysis.

Authors

Lee Yongseok, Leite Walter L

Affiliations

Bureau of Economic and Business Research (BEBR), University of Florida.

School of Human Development and Organizational Studies in Education, University of Florida.

Publication

Psychol Methods. 2024 Jun 13. doi: 10.1037/met0000676.

DOI: 10.1037/met0000676
PMID: 38869857

Abstract

Propensity score analysis (PSA) is a prominent method to alleviate selection bias in observational studies, but missing data in covariates is prevalent and must be dealt with during propensity score estimation. Through Monte Carlo simulations, this study evaluates the use of imputation methods based on multiple random forests algorithms to handle missing data in covariates: multivariate imputation by chained equations-random forest (Caliber), proximity imputation (PI), and missForest. The results indicated that PI and missForest outperformed other methods with respect to bias of average treatment effect regardless of sample size and missing mechanisms. A demonstration of these five methods with PSA to evaluate the effect of participation in center-based care on children's reading ability is provided using data from the Early Childhood Longitudinal Study, Kindergarten Class of 2010-2011. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
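
The abstract compares methods by the bias of the estimated average treatment effect (ATE). As a rough illustration of that criterion only, the sketch below simulates one confounded data set and compares a naive difference in means with an inverse-probability-weighted estimate. Everything in it is an assumption made for illustration: the data-generating model, the use of the true propensity score, the choice of weighting (the abstract does not say which PSA method was used), and the function names. It contains no missing data or imputation step and is not the authors' simulation design.

```python
import math
import random


def ipw_ate(treated, outcome, propensity):
    """Normalized (Hajek) inverse-probability-weighted ATE estimate."""
    w1 = [t / p for t, p in zip(treated, propensity)]
    w0 = [(1 - t) / (1 - p) for t, p in zip(treated, propensity)]
    mean1 = sum(w * y for w, y in zip(w1, outcome)) / sum(w1)
    mean0 = sum(w * y for w, y in zip(w0, outcome)) / sum(w0)
    return mean1 - mean0


def naive_ate(treated, outcome):
    """Unadjusted difference in mean outcomes between the two groups."""
    y1 = [y for t, y in zip(treated, outcome) if t == 1]
    y0 = [y for t, y in zip(treated, outcome) if t == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


def simulate(n, true_ate, rng):
    """Hypothetical model: one covariate x drives both treatment and outcome."""
    treated, outcome, propensity = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        p = 1 / (1 + math.exp(-x))        # true propensity score
        t = 1 if rng.random() < p else 0  # treatment assignment depends on x
        y = true_ate * t + x + rng.gauss(0, 1)
        treated.append(t)
        outcome.append(y)
        propensity.append(p)
    return treated, outcome, propensity


TRUE_ATE = 2.0
rng = random.Random(0)
treated, outcome, propensity = simulate(20000, TRUE_ATE, rng)

# Error of each estimator in this single replication; a Monte Carlo study
# would average this quantity over many replications to obtain the bias.
naive_bias = naive_ate(treated, outcome) - TRUE_ATE
ipw_bias = ipw_ate(treated, outcome, propensity) - TRUE_ATE
print(f"naive error: {naive_bias:.3f}, IPW error: {ipw_bias:.3f}")
```

In a study like the one described, the propensity score would instead be estimated from covariates whose missing values had first been imputed, so imputation quality feeds directly into this error.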


Similar articles

1
A comparison of random forest-based missing imputation methods for covariates in propensity score analysis.
Psychol Methods. 2024 Jun 13. doi: 10.1037/met0000676.
2
missForest with feature selection using binary particle swarm optimization improves the imputation accuracy of continuous data.
Genes Genomics. 2022 Jun;44(6):651-658. doi: 10.1007/s13258-022-01247-8. Epub 2022 Apr 6.
3
Propensity score analysis with partially observed covariates: How should multiple imputation be used?
Stat Methods Med Res. 2019 Jan;28(1):3-19. doi: 10.1177/0962280217713032. Epub 2017 Jun 2.
4
MissForest--non-parametric missing value imputation for mixed-type data.
Bioinformatics. 2012 Jan 1;28(1):112-8. doi: 10.1093/bioinformatics/btr597. Epub 2011 Oct 28.
5
Accuracy of random-forest-based imputation of missing data in the presence of non-normality, non-linearity, and interaction.
BMC Med Res Methodol. 2020 Jul 25;20(1):199. doi: 10.1186/s12874-020-01080-1.
6
Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study.
Am J Epidemiol. 2014 Mar 15;179(6):764-74. doi: 10.1093/aje/kwt312. Epub 2014 Jan 12.
7
A comparison of different methods to handle missing data in the context of propensity score analysis.
Eur J Epidemiol. 2019 Jan;34(1):23-36. doi: 10.1007/s10654-018-0447-z. Epub 2018 Oct 19.
8
Random Forest Missing Data Algorithms.
Stat Anal Data Min. 2017 Dec;10(6):363-377. doi: 10.1002/sam.11348. Epub 2017 Jun 13.
9
Performance of Multiple Imputation Using Modern Machine Learning Methods in Electronic Health Records Data.
Epidemiology. 2023 Mar 1;34(2):206-215. doi: 10.1097/EDE.0000000000001578. Epub 2022 Dec 9.
10
Analyzing the Effect of Imputation on Classification Performance under MCAR and MAR Missing Mechanisms.
Entropy (Basel). 2023 Mar 17;25(3):521. doi: 10.3390/e25030521.

Cited by

1
Associations of daily step counts with depressive symptoms during pregnancy: Ruian birth cohort study.
BMC Public Health. 2025 Aug 19;25(1):2849. doi: 10.1186/s12889-025-24181-2.
2
Prescriptive Predictors of Mindfulness Ecological Momentary Intervention for Social Anxiety Disorder: Machine Learning Analysis of Randomized Controlled Trial Data.
JMIR Ment Health. 2025 May 13;12:e67210. doi: 10.2196/67210.
3
Investigating long-term risk of aortic aneurysm and dissection from fluoroquinolones and the key contributing factors using machine learning methods.
Sci Rep. 2025 Apr 16;15(1):13130. doi: 10.1038/s41598-025-97787-6.