Cabras Stefano
Department of Statistics, Universidad Carlos III de Madrid, Spain; Department of Mathematics and Informatics, Università di Cagliari, Italy.
Stat Methods Med Res. 2018 Feb;27(2):364-383. doi: 10.1177/0962280216628903. Epub 2016 Mar 16.
The problem of multiple hypothesis testing can be represented as a Markov process in which a new alternative hypothesis is accepted according to its evidence relative to the currently accepted one. This virtual process, which is never formally observed, provides the most probable set of non-null hypotheses given the data; it plays the same role as Markov Chain Monte Carlo in approximating a posterior distribution. To apply this representation and obtain the posterior probabilities over all alternative hypotheses, it is enough to have, for each test, only loosely defined Bayes Factors, e.g. Bayes Factors known only up to an unknown constant. Such Bayes Factors may arise either from using default and improper priors or from calibrating p-values with respect to their corresponding Bayes Factor lower bound. Both sources of evidence are used to form a Markov transition kernel on the space of hypotheses. The approach leads to easily interpretable results and involves very simple formulas suitable for analyzing large datasets such as those arising from gene expression data (microarray or RNA-seq experiments).
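The idea can be illustrated with a minimal sketch, which is not the paper's actual transition kernel. It assumes the well-known p-value calibration -e·p·ln(p) (a lower bound on the Bayes Factor of the null against the alternative, valid for p < 1/e), takes its reciprocal as an evidence weight for each alternative, and runs a plain Metropolis chain with a uniform proposal over hypotheses. All function names and the example p-values are hypothetical. Because only ratios of weights enter the acceptance step, any unknown constant common to all weights cancels.

```python
import math
import random


def bf_lower_bound(p):
    """Lower bound on the Bayes Factor of null vs. alternative: -e * p * ln(p).

    The bound is valid for p < 1/e; for larger p it is set to 1 (no evidence).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    if p >= 1.0 / math.e:
        return 1.0
    return -math.e * p * math.log(p)


def evidence_weights(p_values):
    """Assumed evidence weight for each alternative: reciprocal of the bound."""
    return [1.0 / bf_lower_bound(p) for p in p_values]


def stationary(weights):
    """Exact stationary distribution of the chain below: weights normalized."""
    total = sum(weights)
    return [w / total for w in weights]


def simulate_chain(weights, n_steps=20000, seed=0):
    """Metropolis chain over hypothesis indices with a uniform proposal.

    A proposed hypothesis j replaces the current one i with probability
    min(1, w_j / w_i), i.e. according to its relative evidence.
    Returns the fraction of steps spent at each hypothesis.
    """
    rng = random.Random(seed)
    m = len(weights)
    state = rng.randrange(m)
    visits = [0] * m
    for _ in range(n_steps):
        proposal = rng.randrange(m)
        ratio = weights[proposal] / weights[state]
        if rng.random() < min(1.0, ratio):
            state = proposal
        visits[state] += 1
    return [v / n_steps for v in visits]


# Hypothetical p-values from four tests (illustrative only).
p_values = [0.001, 0.01, 0.04, 0.3]
weights = evidence_weights(p_values)
exact = stationary(weights)
approx = simulate_chain(weights)
```

In this toy setting the visit frequencies in `approx` approximate `exact`, so the chain spends most of its time at the hypothesis with the strongest evidence. For a handful of tests the normalization in `stationary` is all that is needed; the simulated chain is shown only to make the Markov process representation concrete.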