Rank Conditional Coverage and Confidence Intervals in High-Dimensional Problems.

Authors

Jean Morrison, Noah Simon

Affiliations

Department of Human Genetics, University of Chicago, Chicago, IL.

Department of Biostatistics, University of Washington, Seattle, WA.

Publication information

J Comput Graph Stat. 2018;27(3):648-656. doi: 10.1080/10618600.2017.1411270. Epub 2018 Jun 14.

DOI:10.1080/10618600.2017.1411270

PMID:30740009

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6364309/

Abstract

Confidence interval procedures used in low dimensional settings are often inappropriate for high dimensional applications. When many parameters are estimated, marginal confidence intervals associated with the most significant estimates have very low coverage rates: They are too small and centered at biased estimates. The problem of forming confidence intervals in high dimensional settings has previously been studied through the lens of selection adjustment. In that framework, the goal is to control the proportion of non-covering intervals formed for selected parameters. In this paper we approach the problem by considering the relationship between rank and coverage probability. Marginal confidence intervals have very low coverage rates for the most significant parameters and high rates for parameters with more boring estimates. Many selection adjusted intervals have the same behavior despite controlling the coverage rate within a selected set. This relationship between rank and coverage rate means that the parameters most likely to be pursued further in follow-up or replication studies are the least likely to be covered by the constructed intervals. In this paper, we propose rank conditional coverage (RCC) as a new coverage criterion for confidence intervals in multiple testing/covering problems. The RCC is the expected coverage rate of an interval given the significance ranking for the associated estimator. We also propose two methods that use bootstrapping to construct confidence intervals that control the RCC. Because these methods make use of additional information captured by the ranks of the parameter estimates, they often produce smaller intervals than marginal or selection adjusted methods. These methods are implemented in R (R Core Team, 2017) in the package rcc available on CRAN at https://cran.r-project.org/web/packages/rcc/index.html.
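The relationship between rank and coverage described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' bootstrap procedure and does not use the rcc package; it only estimates the coverage rate of ordinary marginal 90% intervals at each significance rank. The setup is an assumption chosen for illustration: 100 parameters, all truly zero, each estimated with independent standard normal noise. The function name `coverage_by_rank` and all parameter values are hypothetical.

```python
import random

def coverage_by_rank(thetas, n_rep=2000, z=1.645, seed=0):
    """Estimate the coverage of marginal intervals est +/- z at each rank.

    Rank 0 is the estimate with the largest absolute value (most
    significant); the last rank is the least significant one.
    Assumes each estimate is its true value plus N(0, 1) noise.
    """
    rng = random.Random(seed)
    p = len(thetas)
    hits = [0] * p
    for _ in range(n_rep):
        # Simulate one study: noisy estimates of every parameter.
        est = [t + rng.gauss(0.0, 1.0) for t in thetas]
        # Order parameter indices from most to least significant.
        order = sorted(range(p), key=lambda i: -abs(est[i]))
        for rank, i in enumerate(order):
            # The marginal interval covers when |estimate - truth| <= z.
            if abs(est[i] - thetas[i]) <= z:
                hits[rank] += 1
    return [h / n_rep for h in hits]

# Hypothetical setup: 100 parameters whose true values are all zero.
thetas = [0.0] * 100
cov = coverage_by_rank(thetas)
overall = sum(cov) / len(cov)

print(f"coverage at top rank:    {cov[0]:.3f}")
print(f"coverage at bottom rank: {cov[-1]:.3f}")
print(f"average over all ranks:  {overall:.3f}")
```

In this deliberately extreme setup, the average over ranks stays close to the nominal 90%, while the interval attached to the top-ranked estimate almost never covers its parameter and the interval for the bottom-ranked estimate almost always does. That gap between overall coverage and coverage at a given rank is what the RCC criterion is meant to measure.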

