EigenPrism: inference for high dimensional signal-to-noise ratios.

Authors

Janson Lucas, Barber Rina Foygel, Candès Emmanuel

Publication information

J R Stat Soc Series B Stat Methodol. 2017 Sep;79(4):1037-1065. doi: 10.1111/rssb.12203. Epub 2016 Sep 16.

DOI:10.1111/rssb.12203
PMID:29104447
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5663223/
Abstract

Consider the following three important problems in statistical inference, namely, constructing confidence intervals for (1) the error of a high-dimensional ($p > n$) regression estimator, (2) the linear regression noise level, and (3) the genetic signal-to-noise ratio of a continuous-valued trait (related to the heritability). All three problems turn out to be closely related to the little-studied problem of performing inference on the $\ell_2$-norm of the signal in high-dimensional linear regression. We derive a novel procedure for this, which is asymptotically correct when the covariates are multivariate Gaussian and produces valid confidence intervals in finite samples as well. The procedure, called EigenPrism, is computationally fast and makes no assumptions on coefficient sparsity or knowledge of the noise level. We investigate the width of the EigenPrism confidence intervals, including a comparison with a Bayesian setting in which our interval is just 5% wider than the Bayes credible interval. We are then able to unify the three aforementioned problems by showing that the EigenPrism procedure with only minor modifications is able to make important contributions to all three. We also investigate the robustness of coverage and find that the method applies in practice and in finite samples much more widely than just the case of multivariate Gaussian covariates. Finally, we apply EigenPrism to a genetic dataset to estimate the genetic signal-to-noise ratio for a number of continuous phenotypes.
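
The setting in the abstract can be illustrated with a small simulation. The sketch below is not the EigenPrism procedure. It only sets up a high-dimensional linear model and checks the identity that ties the three problems together, under assumptions that are ours and not taken from the abstract: covariates drawn i.i.d. from N(0, I), a dense coefficient vector, and Gaussian noise. Under those assumptions the variance of each response equals the squared $\ell_2$-norm of the signal plus the noise variance, so the squared norm, the noise level, and the signal-to-noise ratio are linked through one observable quantity. The function name `simulate` and all parameter values are hypothetical.

```python
import numpy as np


def simulate(n=500, p=1000, theta2=2.0, sigma2=1.0, seed=0):
    """Draw (X, y, beta) from y = X @ beta + noise with p > n.

    Assumptions (illustrative only): rows of X are i.i.d. N(0, I_p),
    beta is dense (no sparsity) and rescaled so ||beta||_2^2 == theta2,
    and the noise is N(0, sigma2).
    """
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(p)
    beta *= np.sqrt(theta2) / np.linalg.norm(beta)  # dense signal with fixed norm
    X = rng.standard_normal((n, p))
    y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
    return X, y, beta


sigma2 = 1.0
X, y, beta = simulate(sigma2=sigma2)

theta2 = float(beta @ beta)          # squared l2-norm of the signal
total_var = float(np.mean(y ** 2))   # estimates theta2 + sigma2 (y has mean zero)
snr = theta2 / sigma2                # signal-to-noise ratio

print(f"||beta||^2 = {theta2:.3f}, sigma^2 = {sigma2:.3f}, SNR = {snr:.3f}")
print(f"mean(y^2)  = {total_var:.3f}  (target theta2 + sigma2 = {theta2 + sigma2:.3f})")
```

The sample second moment of `y` only pins down the sum of signal and noise. Separating the two when $p > n$, with a valid confidence interval, is the problem the abstract says EigenPrism addresses.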

