Ghosh Debashis
Department of Statistics, Penn State University, University Park, Pennsylvania, USA.
J Biopharm Stat. 2010 Mar;20(2):193-208. doi: 10.1080/10543400903572704.
In high-throughput studies involving genetic data, such as those from gene expression microarrays, differential expression analysis between two or more experimental conditions is a very common analytical task. Much of the resulting literature on multiple comparisons has paid relatively little attention to the choice of test statistic. In this article, we focus on the choice of test statistic for a special pattern of differential expression. The approach here is based on recasting multiple-comparison procedures for assessing outlying expression values. A major complication is that the resulting p-values are discrete; some theoretical properties of sequential testing procedures in this context are explored. We propose the use of q-value estimation procedures in this setting. Data from a gene expression profiling experiment in prostate cancer are used to illustrate the methodology.
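As a rough illustration of what q-value estimation involves, the following is a minimal sketch that assumes a Storey-type estimator with a fixed tuning parameter. The function name `q_values` and the parameter `lam` are hypothetical and do not come from the article, and the sketch does not reproduce the article's procedure or its treatment of discrete p-values. It only shows the generic computation: estimate the proportion of true nulls, form a step-up false discovery rate estimate at each ordered p-value, and enforce monotonicity.

```python
import numpy as np


def q_values(p_values, lam=0.5):
    """Generic Storey-type q-values (illustrative sketch, not the article's method).

    lam is an assumed fixed tuning parameter in [0, 1) used to estimate
    the proportion of true null hypotheses.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size

    # Estimated proportion of true nulls, capped at 1.
    # With heavily discrete p-values this estimate can be unstable,
    # and it is 0 when no p-value exceeds lam.
    pi0 = min(1.0, np.mean(p > lam) / (1.0 - lam))

    # Sort the p-values and compute the estimated FDR incurred by
    # rejecting every hypothesis with a p-value at or below each one.
    order = np.argsort(p)
    ranked = p[order]
    fdr = pi0 * m * ranked / np.arange(1, m + 1)

    # Enforce monotonicity: q_(i) = min over j >= i of fdr_(j), capped at 1.
    # Tied p-values therefore receive the same q-value.
    q_sorted = np.minimum.accumulate(fdr[::-1])[::-1]
    q_sorted = np.minimum(q_sorted, 1.0)

    # Return the q-values in the original input order.
    q = np.empty(m)
    q[order] = q_sorted
    return q
```

Calling `q_values(p)` on an array of per-gene p-values returns one q-value per gene in the same order. Genes whose q-value falls below a chosen threshold would be flagged at that estimated false discovery rate.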