Considerations for using tree-based machine learning to assess causation between demographic and environmental risk factors and health outcomes.

Affiliations

Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Canada.

Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Canada.

Publication information

Environ Sci Pollut Res Int. 2024 Nov;31(51):60927-60935. doi: 10.1007/s11356-024-35304-4. Epub 2024 Oct 12.

DOI: 10.1007/s11356-024-35304-4
PMID: 39394473

Abstract

Evaluating the heterogeneous treatment effect (HTE) makes it possible to assess the causal effect of a therapy or intervention while accounting for heterogeneity in individual factors within a population. Machine learning (ML) methods have previously been employed for HTE evaluation, addressing the limitations associated with modelling complex systems. In this work, three tree-based ML algorithms, causal random forest (CRF), causal Bayesian additive regression trees (CBART), and causal rule ensemble (CRE), are used to analyze whether benzene exposure potentially causes childhood acute myeloid leukemia (AML). Data for this analysis are generated by drawing samples from a previously developed model that estimates AML probability from demographic information and benzene exposure. The three algorithms are compared in terms of the predicted average treatment effect (ATE), the regression coefficient of determination, and computational time. Minimal difference is reported among the three algorithms in the ATE and in the regression coefficient of determination. However, CRF outperforms CBART in computational time. Moreover, CRF allows for both continuous and binary treatment variables, unlike CBART and CRE, making it better suited to environmental health studies, where pollutant exposure levels should be considered continuous. Following the comparison of all three algorithms, the influence of adding Gaussian noise to the treatment and outcome variables, as well as of outliers, is investigated using CRF. A set of considerations is drawn up to guide researchers in using these algorithms. These considerations cover the simulation settings, applications, and interpretation of results, and aim to provide timely information for decision-making surrounding the establishment of pollutant exposure thresholds in environmental risk assessments.
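The noise experiment described in the abstract can be illustrated with a small toy simulation. The sketch below is not the authors' model and does not use CRF, CBART, or CRE: it substitutes a much simpler partialling-out (residual-on-residual) regression with linear nuisance models, and every name, coefficient, and noise level in it (`simulate`, `average_effect`, the covariates `age` and `sex`, the effect `tau`) is a hypothetical chosen for illustration. It estimates the average effect of a continuous exposure, then repeats the estimate after adding Gaussian noise to the treatment and to the outcome.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Toy data (all values are assumptions, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    age = rng.uniform(0.0, 15.0, n)              # hypothetical demographic covariate
    sex = rng.integers(0, 2, n).astype(float)    # hypothetical binary covariate
    X = np.column_stack([age, sex])
    # Continuous exposure that depends on the covariates (confounding).
    exposure = 1.0 + 0.1 * age + 0.5 * sex + rng.normal(0.0, 1.0, n)
    # Heterogeneous effect of exposure: differs between the two groups.
    tau = 0.5 + 0.2 * sex
    outcome = 0.3 * age + tau * exposure + rng.normal(0.0, 1.0, n)
    return X, exposure, outcome, tau


def residualize(X, v):
    """Remove the part of v explained by a linear model in X (with intercept)."""
    A = np.column_stack([np.ones(len(v)), X])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef


def average_effect(X, t, y):
    """Partialling-out estimate: slope of residualized y on residualized t."""
    t_res = residualize(X, t)
    y_res = residualize(X, y)
    return float(t_res @ y_res / (t_res @ t_res))


X, t, y, tau = simulate()
noise_rng = np.random.default_rng(1)

clean = average_effect(X, t, y)
noisy_treatment = average_effect(X, t + noise_rng.normal(0.0, 1.0, len(t)), y)
noisy_outcome = average_effect(X, t, y + noise_rng.normal(0.0, 1.0, len(y)))

print(f"true average effect      : {tau.mean():.3f}")
print(f"estimate, clean data     : {clean:.3f}")
print(f"estimate, noisy treatment: {noisy_treatment:.3f}")
print(f"estimate, noisy outcome  : {noisy_outcome:.3f}")
```

In this toy setting, noise added to the treatment pulls the estimate toward zero (classical attenuation), while noise added to the outcome leaves the estimate centred on the true value and only widens its spread. Whether tree-based estimators behave the same way on the paper's data is a separate question that this sketch does not address.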

