Risk estimation using probability machines.

Affiliation

Clinical Trials and Outcomes Branch, National Institute of Arthritis, Musculoskeletal and Skin Diseases, National Institutes of Health, Room 4-1350, Bldg 10 CRC, 10 Center Drive, Bethesda, MD 20892-1468, USA.

Publication information

BioData Min. 2014 Mar 1;7(1):2. doi: 10.1186/1756-0381-7-2.

DOI:10.1186/1756-0381-7-2
PMID:24581306
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4015350/
Abstract

BACKGROUND

Logistic regression has been the de facto, and often the only, model used in the description and analysis of relationships between a binary outcome and observed features. It is widely used to obtain the conditional probabilities of the outcome given predictors, as well as predictor effect size estimates using conditional odds ratios.
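
For reference, the two quantities named in the background can be written out as follows. This is the standard textbook form of the logistic model, not notation taken from the paper; the symbols $\beta_0$, $\beta_j$, and $x_j$ are assumptions introduced here for illustration.

```latex
% Conditional probability of the outcome under a main-effects logistic model
\Pr(Y = 1 \mid X = x)
  = \frac{1}{1 + \exp\!\bigl\{-\bigl(\beta_0 + \sum_{j=1}^{p} \beta_j x_j\bigr)\bigr\}}

% Conditional odds ratio for a one-unit increase in x_j,
% holding all other predictors fixed
\mathrm{OR}_j
  = \frac{\operatorname{odds}(Y = 1 \mid x_j + 1,\, x_{-j})}
         {\operatorname{odds}(Y = 1 \mid x_j,\, x_{-j})}
  = \exp(\beta_j)
```

The odds ratio reduces to $\exp(\beta_j)$ only when the model has no interaction terms involving $x_j$; with interactions it depends on the values of the other predictors.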

RESULTS

We show how statistical learning machines for binary outcomes, provably consistent for the nonparametric regression problem, can be used to provide both consistent conditional probability estimation and conditional effect size estimates. Effect size estimates from learning machines leverage our understanding of counterfactual arguments central to the interpretation of such estimates. We show that, if the data generating model is logistic, we can recover accurate probability predictions and effect size estimates with nearly the same efficiency as a correct logistic model, both for main effects and interactions. We also propose a method using learning machines to scan for possible interaction effects quickly and efficiently. Simulations using random forest probability machines are presented.
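
The counterfactual reading of an effect size can be sketched in a few lines. The example below is not the authors' implementation: it swaps in a k-nearest-neighbour estimator for the random forest used in the paper (to stay within the Python standard library), and the simulated coefficients, the function names, and the choice of a risk difference as the effect measure are all assumptions made for illustration. The idea is to fit a nonparametric probability estimator, predict each subject's outcome probability twice (once with the exposure set to 1, once set to 0), and average the difference.

```python
import heapq
import math
import random


def simulate(n, seed=0):
    """Simulate (x, z, y) rows from an assumed logistic model:
    logit P(Y=1) = -1 + 1.0*x + 0.5*z, with binary x and standard normal z."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randint(0, 1)
        z = rng.gauss(0.0, 1.0)
        p = 1.0 / (1.0 + math.exp(-(-1.0 + 1.0 * x + 0.5 * z)))
        y = 1 if rng.random() < p else 0
        data.append((x, z, y))
    return data


def knn_probability(train, x, z, k):
    """Estimate P(Y=1 | x, z) as the mean outcome among the k training
    rows closest to (x, z) in squared Euclidean distance."""
    nearest = heapq.nsmallest(
        k, train, key=lambda r: (r[0] - x) ** 2 + (r[1] - z) ** 2
    )
    return sum(r[2] for r in nearest) / k


def counterfactual_risk_difference(train, k):
    """Average, over all subjects, of the predicted probability with x set
    to 1 minus the predicted probability with x set to 0, keeping each
    subject's own z. Predictions are in-sample, so a subject can be one of
    its own neighbours."""
    diffs = []
    for _, z, _ in train:
        p1 = knn_probability(train, 1, z, k)
        p0 = knn_probability(train, 0, z, k)
        diffs.append(p1 - p0)
    return sum(diffs) / len(diffs)


data = simulate(1000)
risk_difference = counterfactual_risk_difference(data, k=40)
print(f"estimated average risk difference: {risk_difference:.3f}")
```

Nothing in the estimator refers to the logistic form used to simulate the data; only the predictors are specified. Replacing `knn_probability` with any other consistent probability estimator, such as a random forest run in regression mode on the 0/1 outcome, leaves the counterfactual step unchanged.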

CONCLUSIONS

The models we propose make no assumptions about the data structure, and capture the patterns in the data by just specifying the predictors involved and not any particular model structure. So they do not run the same risks of model mis-specification and the resultant estimation biases as a logistic model. This methodology, which we call a "risk machine", will share properties from the statistical machine that it is derived from.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdee/4015350/1c6f628a569d/1756-0381-7-2-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdee/4015350/2bf51b7356b6/1756-0381-7-2-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdee/4015350/642c6bb9eeaa/1756-0381-7-2-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdee/4015350/69c01521ccf9/1756-0381-7-2-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdee/4015350/652644bab495/1756-0381-7-2-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdee/4015350/a63bb4e03da2/1756-0381-7-2-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdee/4015350/e845295ecc1e/1756-0381-7-2-6.jpg

Similar articles

1
Risk estimation using probability machines.
BioData Min. 2014 Mar 1;7(1):2. doi: 10.1186/1756-0381-7-2.
2
Probability machines: consistent probability estimation using nonparametric learning machines.
Methods Inf Med. 2012;51(1):74-81. doi: 10.3414/ME00-01-0052. Epub 2011 Sep 14.
3
Probability estimation with machine learning methods for dichotomous and multicategory outcome: theory.
Biom J. 2014 Jul;56(4):534-63. doi: 10.1002/bimj.201300068. Epub 2014 Jan 29.
4
Calibrating random forests for probability estimation.
Stat Med. 2016 Sep 30;35(22):3949-60. doi: 10.1002/sim.6959. Epub 2016 Apr 13.
5
Estimation in regression models for longitudinal binary data with outcome-dependent follow-up.
Biostatistics. 2006 Jul;7(3):469-85. doi: 10.1093/biostatistics/kxj019. Epub 2006 Jan 20.
6
Sequentially Estimating the Approximate Conditional Mean Using Extreme Learning Machines.
Entropy (Basel). 2020 Nov 13;22(11):1294. doi: 10.3390/e22111294.
7
Understanding overfitting in random forest for probability estimation: a visualization and simulation study.
Diagn Progn Res. 2024 Sep 27;8(1):14. doi: 10.1186/s41512-024-00177-1.
8
Simple approaches to assess the possible impact of missing outcome information on estimates of risk ratios, odds ratios, and risk differences.
Control Clin Trials. 2003 Aug;24(4):411-21. doi: 10.1016/s0197-2456(03)00021-7.
9
Probability estimation with machine learning methods for dichotomous and multicategory outcome: applications.
Biom J. 2014 Jul;56(4):564-83. doi: 10.1002/bimj.201300077. Epub 2014 Feb 12.
10
Firth's logistic regression with rare events: accurate effect estimates and predictions?
Stat Med. 2017 Jun 30;36(14):2302-2317. doi: 10.1002/sim.7273. Epub 2017 Mar 12.

Cited by

1
Nonarteritic Anterior Ischemic Optic Neuropathy in Black Patients.
Am J Ophthalmol. 2025 Feb;270:192-202. doi: 10.1016/j.ajo.2024.09.036. Epub 2024 Oct 15.
2
Disparities in Salmonellosis Incidence for US Counties with Different Social Determinants of Health Profiles Are Also Mediated by Extreme Weather: A Counterfactual Analysis of Laboratory Enteric Disease Surveillance (LEDS) Data From 1997 through 2019.
J Food Prot. 2024 Dec;87(12):100379. doi: 10.1016/j.jfp.2024.100379. Epub 2024 Oct 15.

References cited by this article

1
Comparative validation of the D. melanogaster modENCODE transcriptome annotation.
Genome Res. 2014 Jul;24(7):1209-23. doi: 10.1101/gr.159384.113.
2
Probability machines: consistent probability estimation using nonparametric learning machines.
Methods Inf Med. 2012;51(1):74-81. doi: 10.3414/ME00-01-0052. Epub 2011 Sep 14.
3
Estimating Individual Treatment Effect in Observational Data Using Random Forest Methods.
J Comput Graph Stat. 2018;27(1):209-219. doi: 10.1080/10618600.2017.1356325. Epub 2018 Feb 1.
4
The optimal crowd learning machine.
BioData Min. 2017 May 19;10:16. doi: 10.1186/s13040-017-0135-7. eCollection 2017.
5
A pilot study investigating changes in neural processing after mindfulness training in elite athletes.
Front Behav Neurosci. 2015 Aug 27;9:229. doi: 10.3389/fnbeh.2015.00229. eCollection 2015.
6
O brave new world that has such machines in it.
BioData Min. 2014 Nov 17;7:26. doi: 10.1186/1756-0381-7-26. eCollection 2014.
7
Gene-Gene Interaction Among WNT Genes for Oral Cleft in Trios.
Genet Epidemiol. 2015 Jul;39(5):385-94. doi: 10.1002/gepi.21888. Epub 2015 Feb 6.
8
Variable selection method for the identification of epistatic models.
Pac Symp Biocomput. 2015;20:195-206.
9
First complex, then simple.
BioData Min. 2014 Jul 18;7:13. doi: 10.1186/1756-0381-7-13. eCollection 2014.