Statistical learning and selective inference.

Authors

Jonathan Taylor, Robert J. Tibshirani

Affiliations

Department of Statistics, Stanford University, Stanford, CA 94305;

Department of Health Research & Policy and Department of Statistics, Stanford University, Stanford, CA 94305

Publication information

Proc Natl Acad Sci U S A. 2015 Jun 23;112(25):7629-34. doi: 10.1073/pnas.1507583112.


DOI: 10.1073/pnas.1507583112
PMID: 26100887
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4485109/

Abstract

We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
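
The "higher bar" mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the method developed in the paper. It is a simplified illustration under assumed conditions: 20 independent pure-noise features, each summarized by a standard normal z-statistic, with the strongest one reported. The function names and parameter values are hypothetical. It compares the naive p-value of the cherry-picked statistic with a Šidák-style adjustment that accounts for the search over all features.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(num_features=20, num_trials=5000, alpha=0.05, seed=0):
    """Estimate false-positive rates when only the strongest feature is tested.

    Every feature is pure noise, so any declared association is a false positive.
    """
    rng = random.Random(seed)
    naive_hits = 0
    adjusted_hits = 0
    for _ in range(num_trials):
        # Under the global null, each z-statistic is an independent N(0, 1) draw.
        z_scores = [rng.gauss(0.0, 1.0) for _ in range(num_features)]
        # Cherry-pick the statistic with the largest absolute value.
        best = max(z_scores, key=abs)
        # Naive p-value: ignores that 'best' was selected after looking at the data.
        p_naive = two_sided_p(best)
        # Sidak adjustment: p-value of the maximum of num_features independent tests.
        p_adjusted = 1.0 - (1.0 - p_naive) ** num_features
        naive_hits += p_naive < alpha
        adjusted_hits += p_adjusted < alpha
    return naive_hits / num_trials, adjusted_hits / num_trials


naive_rate, adjusted_rate = simulate()
print(f"naive false-positive rate:    {naive_rate:.3f}")
print(f"adjusted false-positive rate: {adjusted_rate:.3f}")
```

Under these assumptions the naive rate should land near 1 - 0.95^20 ≈ 0.64, while the adjusted rate should stay close to the nominal 0.05. The adjustment here relies on independence between features. The selective inference methods the abstract refers to address the harder setting where the selection comes from procedures such as forward stepwise regression or the lasso.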

