Hierarchical inverse Gaussian models and multiple testing: application to gene expression data.

Authors

Labbe Aurelie, Thompson Mary

Affiliation

Universite Laval.

Publication information

Stat Appl Genet Mol Biol. 2005;4:Article23. doi: 10.2202/1544-6115.1151. Epub 2005 Sep 6.

DOI:10.2202/1544-6115.1151

PMID:16646841

Abstract

Detecting differentially expressed genes in microarray experiments is a topic that has been well studied in the literature. Many hypothesis testing methods have been proposed that rely on strong distributional assumptions for the gene intensities. However, the shape of microarray data may vary substantially from one experiment to another, and model assumptions may be seriously violated in many cases. The literature on microarray data is mainly based on two distributions: the log-normal and the gamma distributions, that often appear to be effective when used in a Bayesian hierarchical framework. However, if a model that fits the data well in a global manner seems attractive, two points should be regarded with attention: the ability of the model to fit the tail of the observed distribution, and its robustness to a wrong specification of the model, in terms of error rates for the hypothesis tests. In order to focus on these aspects, we propose to use Bayesian models involving the inverse Gaussian distribution to describe gene expression data. We show that these models can be good competitors to the traditional Bayesian or random effect gamma or log-normal models in some situations. A multiple testing procedure is then proposed, based on an asymptotic property of the posterior probability of the one-sided alternative hypothesis. We show that the asymptotic property is well approximated for inverse Gaussian models, even when the number of observations available for each test is very small.
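As a rough illustration of the distributions the abstract compares, the sketch below fits an inverse Gaussian and a log-normal distribution to a small sample of positive values by plain maximum likelihood and reports their log-likelihoods. It is not the authors' hierarchical Bayesian model or their testing procedure, which the abstract does not specify in detail. The function names and the toy intensity values are hypothetical and chosen only for the example.

```python
import math


def fit_inverse_gaussian(xs):
    """Closed-form MLE for the inverse Gaussian IG(mu, lam).

    mu_hat is the sample mean and 1/lam_hat is the mean of (1/x - 1/mu_hat).
    Assumes all values are positive and not all identical.
    """
    n = len(xs)
    mu = sum(xs) / n
    lam = n / sum(1.0 / x - 1.0 / mu for x in xs)
    return mu, lam


def loglik_inverse_gaussian(xs, mu, lam):
    """Log-likelihood under the density
    sqrt(lam / (2 pi x^3)) * exp(-lam (x - mu)^2 / (2 mu^2 x))."""
    return sum(
        0.5 * math.log(lam / (2.0 * math.pi * x ** 3))
        - lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x)
        for x in xs
    )


def fit_lognormal(xs):
    """MLE for the log-normal: mean and variance (divisor n) of log(x)."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    m = sum(logs) / n
    s2 = sum((v - m) ** 2 for v in logs) / n
    return m, s2


def loglik_lognormal(xs, m, s2):
    """Log-likelihood of a log-normal with log-scale mean m and variance s2."""
    return sum(
        -math.log(x)
        - 0.5 * math.log(2.0 * math.pi * s2)
        - (math.log(x) - m) ** 2 / (2.0 * s2)
        for x in xs
    )


# Hypothetical right-skewed "intensities", not data from the paper.
toy_intensities = [0.8, 1.1, 1.3, 1.7, 2.4, 3.9, 7.5]

mu_hat, lam_hat = fit_inverse_gaussian(toy_intensities)
m_hat, s2_hat = fit_lognormal(toy_intensities)

print("inverse Gaussian log-likelihood:",
      loglik_inverse_gaussian(toy_intensities, mu_hat, lam_hat))
print("log-normal log-likelihood:",
      loglik_lognormal(toy_intensities, m_hat, s2_hat))
```

Both models have two parameters, so their maximized log-likelihoods can be compared directly on the same sample. Such a comparison measures overall fit only and does not assess tail fit or robustness to misspecification, the two criteria the abstract emphasizes.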


Similar articles

Hierarchical inverse Gaussian models and multiple testing: application to gene expression data.

Stat Appl Genet Mol Biol. 2005;4:Article23. doi: 10.2202/1544-6115.1151. Epub 2005 Sep 6.

Powers of multiple-testing procedures for identification of genes significantly differentially expressed in microarray experiments.

Yi Chuan Xue Bao. 2006 Dec;33(12):1132-40. doi: 10.1016/S0379-4172(06)60152-2.

A new efficient statistical test for detecting variability in the gene expression data.

Stat Methods Med Res. 2008 Aug;17(4):405-19. doi: 10.1177/0962280206078643. Epub 2007 Aug 14.

Segmentation and intensity estimation of microarray images using a gamma-t mixture model.

Bioinformatics. 2007 Feb 15;23(4):458-65. doi: 10.1093/bioinformatics/btl630. Epub 2006 Dec 12.

Variance stabilization and normalization for one-color microarray data using a data-driven multiscale approach.

Bioinformatics. 2006 Oct 15;22(20):2547-53. doi: 10.1093/bioinformatics/btl412. Epub 2006 Jul 28.

Quantile-function based null distribution in resampling based multiple testing.

Stat Appl Genet Mol Biol. 2006;5:Article14. doi: 10.2202/1544-6115.1199. Epub 2006 May 21.

Empirical bayes estimation of a sparse vector of gene expression changes.

Stat Appl Genet Mol Biol. 2005;4:Article22. doi: 10.2202/1544-6115.1132. Epub 2005 Sep 6.

Tail posterior probability for inference in pairwise and multiclass gene expression data.

Biometrics. 2007 Dec;63(4):1117-25. doi: 10.1111/j.1541-0420.2007.00807.x.

A Bayesian approach to the multiplicity problem for significance testing with binomial data.

Biometrics. 1987 Jun;43(2):301-11.

A Bayesian approach to estimation and testing in time-course microarray experiments.

Stat Appl Genet Mol Biol. 2007;6:Article24. doi: 10.2202/1544-6115.1299. Epub 2007 Sep 16.

Cited by

Transcriptional profiling implicates novel interactions between abiotic stress and hormonal responses in Thellungiella, a close relative of Arabidopsis.

Plant Physiol. 2006 Apr;140(4):1437-50. doi: 10.1104/pp.105.070508. Epub 2006 Feb 24.
