Division of Biostatistics and Bioinformatics, Penn State University, Hershey, PA 17033, USA.
Biostatistics. 2014 Jan;15(1):60-73. doi: 10.1093/biostatistics/kxt026. Epub 2013 Aug 8.
Empirical Bayes methods have been used extensively for microarray data analysis, modeling the large number of unknown parameters as random effects. Empirical Bayes allows information to be borrowed across genes and can automatically adjust for multiple testing and selection bias. However, the standard empirical Bayes model can perform poorly if the assumed working prior deviates from the true prior. This paper proposes a new rank-conditioned inference in which the shrinkage and confidence intervals are based on the distribution of the error conditioned on the rank of the data. This contrasts with a Bayesian posterior, which conditions on the data themselves. The new method is almost as efficient as standard Bayesian methods when the working prior is close to the true prior, and it is much more robust when the working prior is not close. In addition, it allows a more accurate (but also more complex) non-parametric estimate of the prior to be easily incorporated, resulting in improved inference. The new method's robustness to the choice of prior is demonstrated via simulation experiments, and an application to a breast cancer gene expression microarray dataset is presented. Our R package rank.Shrinkage provides a ready-to-use implementation of the proposed methodology.
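The idea of conditioning on rank rather than on the data can be illustrated with a small Monte Carlo sketch. It is not the rank.Shrinkage package and does not reproduce the paper's estimator. It assumes a normal working prior, unit-variance normal noise, and a hypothetical helper named `rank_conditioned_estimates`, all chosen here purely for illustration. For each rank, the sketch simulates the distribution of the error (observation minus true parameter) of whichever unit lands at that rank, then subtracts the mean error to shrink and uses error quantiles to form an interval.

```python
import numpy as np

def rank_conditioned_estimates(x, prior_sampler, sigma=1.0, n_sim=500,
                               level=0.90, seed=0):
    """Illustrative rank-conditioned shrinkage (hypothetical helper).

    x             : observed statistics, one per gene
    prior_sampler : function (rng, n) -> n draws from the working prior
    sigma         : assumed known noise standard deviation
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    order = np.argsort(x)  # order[r] is the index of the r-th smallest x

    # errors[b, r]: simulated error of the unit that ends up at rank r
    errors = np.empty((n_sim, n))
    for b in range(n_sim):
        theta = prior_sampler(rng, n)
        x_sim = theta + sigma * rng.standard_normal(n)
        o = np.argsort(x_sim)
        errors[b] = x_sim[o] - theta[o]

    alpha = 1.0 - level
    bias = errors.mean(axis=0)
    q_lo = np.quantile(errors, alpha / 2, axis=0)
    q_hi = np.quantile(errors, 1 - alpha / 2, axis=0)

    est = np.empty(n)
    lower = np.empty(n)
    upper = np.empty(n)
    est[order] = x[order] - bias      # shrink by the mean error at each rank
    lower[order] = x[order] - q_hi    # theta = x - error, so bounds swap
    upper[order] = x[order] - q_lo
    return est, lower, upper

# Toy data: 200 "genes" whose true effects follow the working prior N(0, 1).
data_rng = np.random.default_rng(1)
theta_true = data_rng.normal(0.0, 1.0, 200)
x = theta_true + data_rng.standard_normal(200)

est, lower, upper = rank_conditioned_estimates(
    x, prior_sampler=lambda rng, n: rng.normal(0.0, 1.0, n)
)
```

In this toy setting the largest observation is pulled downward and the smallest upward, which is the selection-bias correction the abstract refers to. A different working prior can be tried by swapping the `prior_sampler` argument.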