Random forests for the analysis of matched case-control studies.

Affiliations

Chair of Epidemiology, TUM School of Medicine and Health, Technical University of Munich, Munich, Germany.

Institute of Medical Biometry, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Bonn, Germany.

Publication information

BMC Bioinformatics. 2024 Aug 1;25(1):253. doi: 10.1186/s12859-024-05877-5.

DOI: 10.1186/s12859-024-05877-5
PMID: 39090608
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11292918/

Abstract

BACKGROUND

Conditional logistic regression trees have been proposed as a flexible alternative to conditional logistic regression, the standard method for analyzing matched case-control studies. Although they avoid the strict assumption of linearity and incorporate interactions automatically, conditional logistic regression trees can suffer from relatively high variability. Other machine learning methods for matched case-control studies are lacking because conventional machine learning methods cannot handle the matched structure of the data.
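To make the role of the matched structure concrete, the following is a minimal sketch of the conditional log-likelihood that conditional logistic regression maximizes, assuming each matched set contains exactly one case. The function name `conditional_log_likelihood` and the toy data are illustrative assumptions and are not taken from the paper. Each matched set contributes the probability that the observed case, rather than one of its controls, is the case within that set, so anything constant within a set cancels out. A conventional classifier has no such per-set normalization.

```python
import numpy as np

def conditional_log_likelihood(beta, X, y, strata):
    """Conditional logistic log-likelihood, assuming one case per matched set."""
    eta = np.asarray(X, dtype=float) @ np.asarray(beta, dtype=float)
    y = np.asarray(y)
    strata = np.asarray(strata)
    total = 0.0
    for s in np.unique(strata):
        in_set = strata == s
        eta_s = eta[in_set]
        case_eta = eta_s[y[in_set] == 1]
        if case_eta.size != 1:
            raise ValueError("each matched set must contain exactly one case")
        # log( exp(eta_case) / sum_j exp(eta_j) ), computed stably
        m = eta_s.max()
        total += case_eta[0] - (m + np.log(np.exp(eta_s - m).sum()))
    return total

# Toy data (hypothetical): three 1:1 matched pairs, one binary exposure.
X = np.array([[1.0], [0.0], [1.0], [1.0], [0.0], [1.0]])
y = np.array([1, 0, 1, 0, 0, 1])
strata = np.array([0, 0, 1, 1, 2, 2])

# With beta = 0 every pair contributes log(1/2).
print(conditional_log_likelihood([0.0], X, y, strata))

# Adding a set-specific constant to the covariate leaves the value unchanged,
# which is why effects of the matching variables are not estimated.
X_shifted = X + 5.0 * strata[:, None]
print(conditional_log_likelihood([0.7], X, y, strata))
print(conditional_log_likelihood([0.7], X_shifted, y, strata))
```

In practice one would rely on an established implementation rather than this sketch; it is only meant to show where the matched sets enter the likelihood.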

RESULTS

A random forest method for matched case-control studies, built on conditional logistic regression trees, is proposed to overcome the issue of high variability. It estimates exposure effects accurately while allowing a more flexible functional form for covariate effects. The effectiveness of the method is illustrated in a simulation study and in an application to real-world data from a matched case-control study of the effect of regular participation in cervical cancer screening on the development of cervical cancer.
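The abstract does not describe how the individual trees of the forest are trained. One natural way to keep the matched structure intact in a bagging-style ensemble, shown here purely as an assumption and not as the authors' algorithm, is to resample whole matched sets instead of individual observations. The helper name `resample_matched_sets` is hypothetical.

```python
import numpy as np

def resample_matched_sets(strata, rng):
    """Bootstrap whole matched sets (assumed scheme, not the paper's).

    Returns the selected row indices and fresh set labels, so that a set
    drawn twice is treated as two separate matched sets.
    """
    strata = np.asarray(strata)
    set_ids = np.unique(strata)
    drawn = rng.choice(set_ids, size=set_ids.size, replace=True)
    rows, new_ids = [], []
    for new_id, s in enumerate(drawn):
        idx = np.flatnonzero(strata == s)
        rows.append(idx)
        new_ids.append(np.full(idx.size, new_id))
    return np.concatenate(rows), np.concatenate(new_ids)

# Toy example (hypothetical data): three 1:1 matched pairs.
strata = np.array([0, 0, 1, 1, 2, 2])
y = np.array([1, 0, 1, 0, 0, 1])
rows, ids = resample_matched_sets(strata, np.random.default_rng(0))
print(rows, ids)
```

Each resampled data set still has exactly one case per matched set, so a conditional logistic regression tree could be fitted to it, and averaging over many such trees is the usual mechanism by which an ensemble reduces the variability of a single tree.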

CONCLUSIONS

The proposed random forest method is a promising addition to the toolbox for analyzing matched case-control studies and addresses the need for machine learning methods in this field. It is more flexible than both the standard method of conditional logistic regression and conditional logistic regression trees. It allows for non-linearity and the automatic inclusion of interaction effects, and it is suitable for both exploratory and explanatory analyses.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4596/11292918/a1d0945a48fc/12859_2024_5877_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4596/11292918/2456e2e0b506/12859_2024_5877_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4596/11292918/880fa0bc4623/12859_2024_5877_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4596/11292918/8935dccbfb9b/12859_2024_5877_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4596/11292918/a459bcd9622c/12859_2024_5877_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4596/11292918/d65ea2141a1d/12859_2024_5877_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4596/11292918/dabf309ed980/12859_2024_5877_Fig7_HTML.jpg
