Sharp simultaneous confidence intervals for the means of selected populations with application to microarray data analysis.

Authors

Qiu Jing, Hwang J T Gene

Affiliation

Department of Statistics, University of Missouri-Columbia, Columbia, Missouri 65211, USA.

Publication

Biometrics. 2007 Sep;63(3):767-76. doi: 10.1111/j.1541-0420.2007.00770.x. Epub 2007 Apr 2.

DOI:10.1111/j.1541-0420.2007.00770.x

PMID:17403105

Abstract

Simultaneous inference for a large number, N, of parameters is a challenge. In some situations, such as microarray experiments, researchers are only interested in making inference for the K parameters corresponding to the K most extreme estimates. Hence it seems important to construct simultaneous confidence intervals for these K parameters. The naïve simultaneous confidence intervals for the K means (applied directly without taking into account the selection) have low coverage probabilities. We take an empirical Bayes approach (or an approach based on the random effect model) to construct simultaneous confidence intervals with good coverage probabilities. For N = 10,000 and K = 100, typical for microarray data, our confidence intervals could be 77% shorter than the naïve K-dimensional simultaneous intervals.
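The coverage problem described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' construction: it assumes a random effect model with means drawn from N(0, tau^2), unit-variance normal noise, selection of the K estimates largest in absolute value, and Bonferroni critical values. The function name `simulate_coverage`, the choice tau = 1, and the "oracle" shrinkage interval (which treats tau as known rather than estimating it, as an empirical Bayes method would) are all hypothetical choices made for illustration.

```python
import heapq
import random
from statistics import NormalDist


def simulate_coverage(n=10_000, k=100, tau=1.0, alpha=0.05, reps=100, seed=0):
    """Estimate simultaneous coverage for the k selected means.

    Assumed model (illustrative only): theta_i ~ N(0, tau^2),
    x_i | theta_i ~ N(theta_i, 1). Returns (naive, shrunk) coverage rates.
    """
    rng = random.Random(seed)
    # Bonferroni critical value for k two-sided intervals.
    z = NormalDist().inv_cdf(1 - alpha / (2 * k))
    # Under the assumed model, theta | x ~ N(shrink * x, shrink).
    shrink = tau**2 / (1 + tau**2)
    half_shrunk = z * shrink**0.5

    naive_hits = shrunk_hits = 0
    for _ in range(reps):
        theta = [rng.gauss(0.0, tau) for _ in range(n)]
        x = [t + rng.gauss(0.0, 1.0) for t in theta]
        # Select the k most extreme estimates (largest |x_i|).
        top = heapq.nlargest(k, range(n), key=lambda i: abs(x[i]))
        # Naive intervals: x_i +/- z, ignoring the selection step.
        naive_hits += all(abs(theta[i] - x[i]) <= z for i in top)
        # Oracle shrinkage intervals: shrink * x_i +/- z * sqrt(shrink).
        shrunk_hits += all(
            abs(theta[i] - shrink * x[i]) <= half_shrunk for i in top
        )
    return naive_hits / reps, shrunk_hits / reps


if __name__ == "__main__":
    naive, shrunk = simulate_coverage()
    print(f"naive simultaneous coverage:  {naive:.2f}")
    print(f"shrunk simultaneous coverage: {shrunk:.2f}")
```

In this sketch the naive intervals are centered on estimates that were selected for being extreme, so they tend to sit too far from the true means. The shrinkage intervals are centered closer to zero and are shorter by a factor of sqrt(tau^2 / (1 + tau^2)). The length reduction here depends on the assumed tau and is not meant to reproduce the 77% figure reported in the abstract.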