Hierarchical inverse Gaussian models and multiple testing: application to gene expression data.

Author information

Labbe Aurelie, Thompson Mary

Affiliation

Universite Laval.

Publication information

Stat Appl Genet Mol Biol. 2005;4:Article23. doi: 10.2202/1544-6115.1151. Epub 2005 Sep 6.

Abstract

Detecting differentially expressed genes in microarray experiments is a topic that has been well studied in the literature. Many of the proposed hypothesis testing methods rely on strong distributional assumptions for the gene intensities. However, the shape of microarray data may vary substantially from one experiment to another, and model assumptions may be seriously violated in many cases. The literature on microarray data is mainly based on two distributions, the log-normal and the gamma, which often appear to be effective when used in a Bayesian hierarchical framework. However, even when a model that fits the data well globally seems attractive, two points deserve attention: the ability of the model to fit the tail of the observed distribution, and the robustness of the hypothesis-test error rates to misspecification of the model. To focus on these aspects, we propose using Bayesian models involving the inverse Gaussian distribution to describe gene expression data. We show that these models can be good competitors to the traditional Bayesian or random-effect gamma and log-normal models in some situations. A multiple testing procedure is then proposed, based on an asymptotic property of the posterior probability of the one-sided alternative hypothesis. We show that the asymptotic property is well approximated for inverse Gaussian models, even when the number of observations available for each test is very small.
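
The abstract does not state which parameterization of the inverse Gaussian distribution the hierarchical models use. As a rough illustration of the tail-fit concern it raises, the sketch below assumes the standard two-parameter form with mean `mu` and shape `lam` (variance `mu**3 / lam`) and compares its upper-tail probability with that of a log-normal distribution matched on mean and variance. The function names and the chosen values of `mu` and `lam` are hypothetical, and this is not the authors' hierarchical model or testing procedure. The gamma distribution is left out only because its tail probability needs the incomplete gamma function, which is not in the Python standard library.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def invgauss_pdf(x, mu, lam):
    """Inverse Gaussian density with mean mu and shape lam (assumed parameterization)."""
    if x <= 0.0:
        return 0.0
    scale = math.sqrt(lam / (2.0 * math.pi * x ** 3))
    return scale * math.exp(-lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x))


def invgauss_cdf(x, mu, lam):
    """Closed-form inverse Gaussian CDF; exp(2*lam/mu) can overflow for large lam/mu."""
    if x <= 0.0:
        return 0.0
    r = math.sqrt(lam / x)
    return norm_cdf(r * (x / mu - 1.0)) + math.exp(2.0 * lam / mu) * norm_cdf(
        -r * (x / mu + 1.0)
    )


def invgauss_sf(x, mu, lam):
    """Upper-tail probability P(X > x), written directly to avoid computing 1 - cdf."""
    if x <= 0.0:
        return 1.0
    r = math.sqrt(lam / x)
    return norm_cdf(-r * (x / mu - 1.0)) - math.exp(2.0 * lam / mu) * norm_cdf(
        -r * (x / mu + 1.0)
    )


def lognormal_sf(x, mean, var):
    """Upper-tail probability of a log-normal with the given mean and variance."""
    if x <= 0.0:
        return 1.0
    sigma2 = math.log(1.0 + var / mean ** 2)   # variance on the log scale
    m = math.log(mean) - 0.5 * sigma2          # mean on the log scale
    return norm_cdf(-(math.log(x) - m) / math.sqrt(sigma2))


if __name__ == "__main__":
    mu, lam = 1.0, 1.0            # illustrative values, not taken from the paper
    var = mu ** 3 / lam           # inverse Gaussian variance, matched by the log-normal
    for x in (2.0, 5.0, 10.0, 20.0):
        print(x, invgauss_sf(x, mu, lam), lognormal_sf(x, mu, var))
```

Two distributions that agree on mean and variance can still assign quite different probabilities to extreme intensities, which is why a good global fit does not by itself guarantee a good fit in the tail.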
