Zoh Roger S, Sarkar Abhra, Carroll Raymond J, Mallick Bani K
Department of Epidemiology & Biostatistics, Texas A&M University, 1266 TAMU, College Station, TX 77843-1266, USA.
Department of Statistical Science, Duke University, Box 90251, Durham NC 27708-0251, USA.
J Am Stat Assoc. 2018;113(524):1733-1741. doi: 10.1080/01621459.2017.1371024. Epub 2018 Aug 6.
We develop a Bayes factor based testing procedure for comparing two population means in high dimensional settings. In 'large-p-small-n' settings, Bayes factors based on proper priors require eliciting a large and complex p × p covariance matrix, whereas Bayes factors based on Jeffreys' prior suffer the same impediment as the classical Hotelling test statistic, as they involve inversion of ill-formed sample covariance matrices. To circumvent this limitation, we propose that the Bayes factor be based on lower dimensional random projections of the high dimensional data vectors. We choose the prior under the alternative to maximize the power of the test for a fixed threshold level, yielding a restricted most powerful Bayesian test (RMPBT). The final test statistic is based on the ensemble of Bayes factors corresponding to multiple replications of randomly projected data. We show that the test is unbiased and, under mild conditions, is also locally consistent. We demonstrate the efficacy of the approach through simulated and real data examples.
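The project-then-ensemble idea described in the abstract can be illustrated with a rough sketch. This is not the authors' RMPBT: the abstract does not give the Bayes factor or the prior, so the sketch substitutes a classical Hotelling T² statistic computed in the projected space and averages it over random projections. The function names `projected_t2` and `ensemble_t2`, the projection dimension `m`, and the number of projections `n_proj` are all assumptions made for illustration.

```python
import numpy as np


def projected_t2(x, y, m, rng):
    """Two-sample Hotelling T^2 after one random projection to m dimensions.

    Stand-in statistic for illustration only; the paper uses a Bayes factor.
    """
    n1, p = x.shape
    n2 = y.shape[0]
    # Random p x m matrix with orthonormal columns (QR of a Gaussian matrix).
    q, _ = np.linalg.qr(rng.standard_normal((p, m)))
    xp, yp = x @ q, y @ q
    diff = xp.mean(axis=0) - yp.mean(axis=0)
    # Pooled covariance in the projected space. For continuous data it is
    # invertible when m <= n1 + n2 - 2, unlike the full p x p version.
    sx = np.atleast_2d(np.cov(xp, rowvar=False))
    sy = np.atleast_2d(np.cov(yp, rowvar=False))
    pooled = ((n1 - 1) * sx + (n2 - 1) * sy) / (n1 + n2 - 2)
    return (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(pooled, diff)


def ensemble_t2(x, y, m=5, n_proj=100, seed=0):
    """Average the projected statistic over n_proj independent projections."""
    rng = np.random.default_rng(seed)
    stats = [projected_t2(x, y, m, rng) for _ in range(n_proj)]
    return float(np.mean(stats))
```

Because each projection reduces the data to m dimensions with m well below the total sample size, the covariance matrix being inverted is small and well conditioned, which is the difficulty with the unprojected Hotelling statistic that the abstract points to. Calibrating the averaged statistic (a threshold or null distribution) is not covered by this sketch.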