Linear models and empirical bayes methods for assessing differential expression in microarray experiments.

Author information

Smyth Gordon K

Affiliation

Walter and Eliza Hall Institute.

Publication information

Stat Appl Genet Mol Biol. 2004;3:Article3. doi: 10.2202/1544-6115.1027. Epub 2004 Feb 12.

Abstract

The problem of identifying differentially expressed genes in designed microarray experiments is considered. Lonnstedt and Speed (2002) derived an expression for the posterior odds of differential expression in a replicated two-color experiment using a simple hierarchical parametric model. The purpose of this paper is to develop the hierarchical model of Lonnstedt and Speed (2002) into a practical approach for general microarray experiments with arbitrary numbers of treatments and RNA samples. The model is reset in the context of general linear models with arbitrary coefficients and contrasts of interest. The approach applies equally well to both single-channel and two-color microarray experiments. Consistent, closed-form estimators are derived for the hyperparameters in the model. The estimators proposed have robust behavior even for small numbers of arrays and allow for incomplete data arising from spot filtering or spot quality weights. The posterior odds statistic is reformulated in terms of a moderated t-statistic in which posterior residual standard deviations are used in place of ordinary standard deviations. The empirical Bayes approach is equivalent to shrinkage of the estimated sample variances towards a pooled estimate, resulting in far more stable inference when the number of arrays is small. The use of moderated t-statistics has the advantage over the posterior odds that the number of hyperparameters which need to be estimated is reduced; in particular, knowledge of the non-null prior for the fold changes is not required. The moderated t-statistic is shown to follow a t-distribution with augmented degrees of freedom. The moderated t inferential approach extends to accommodate tests of composite null hypotheses through the use of moderated F-statistics. The performance of the methods is demonstrated in a simulation study. Results are presented for two publicly available data sets.
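
The variance shrinkage and the moderated t-statistic described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper's software. It assumes the prior degrees of freedom `d0` and prior variance `s0_sq` are already known, whereas the paper estimates these hyperparameters from the data. The function name, argument names, and numbers are hypothetical and chosen only for illustration.

```python
import math


def moderated_t(beta_hat, s2, df_resid, v, d0, s0_sq):
    """Return (moderated t, degrees of freedom) for a single gene.

    beta_hat : estimated coefficient or contrast (log fold change)
    s2       : ordinary residual sample variance for the gene
    df_resid : residual degrees of freedom for the gene
    v        : unscaled variance factor of beta_hat from the design matrix
    d0, s0_sq: prior degrees of freedom and prior variance
               (assumed known here; estimated from all genes in practice)
    """
    # Posterior variance: the sample variance shrunk towards the prior value,
    # weighted by the respective degrees of freedom.
    s2_post = (d0 * s0_sq + df_resid * s2) / (d0 + df_resid)
    t_mod = beta_hat / math.sqrt(s2_post * v)
    # The reference t-distribution gains the prior degrees of freedom.
    return t_mod, d0 + df_resid


# Hypothetical gene with an unusually small sample variance on few arrays.
t_ord = 1.0 / math.sqrt(0.01 * 0.5)  # ordinary t-statistic, for comparison
t_mod, df = moderated_t(beta_hat=1.0, s2=0.01, df_resid=2, v=0.5,
                        d0=4.0, s0_sq=0.1)
print(round(t_ord, 2), round(t_mod, 2), df)
```

In this made-up case the ordinary statistic is inflated by the tiny sample variance, while the moderated statistic is smaller and is referred to a t-distribution with `d0 + df_resid` degrees of freedom rather than `df_resid`.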

