Construction of null statistics in permutation-based multiple testing for multi-factorial microarray experiments.

Author information

Gao Xin

Affiliation

Department of Mathematics and Statistics, York University, 4700 Keele Street, Toronto, ON M3J 1P3, Canada.

Publication information

Bioinformatics. 2006 Jun 15;22(12):1486-94. doi: 10.1093/bioinformatics/btl109. Epub 2006 Mar 30.

Abstract

MOTIVATION

The parametric F-test is widely used in the analysis of factorial microarray experiments to assess treatment effects. However, the normality assumption is often untenable for microarray experiments with few replicates, so permutation-based methods are needed to assess statistical significance. The distribution of the F-statistics across all genes on the array can be regarded as a mixture: one proportion of the statistics is generated from the null distribution of no differential gene expression, and the remainder from the alternative distribution of differentially expressed genes. As a result, the permutation distribution of the F-statistics may not approximate the true null distribution of the F-statistic well. Constructing a proper null statistic that better approximates this null distribution is therefore of great importance for permutation-based multiple testing in microarray data analysis.
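To make the standard procedure concrete, the following is a minimal sketch of a permutation F-test that pools permuted statistics across genes. It assumes a one-way layout for simplicity (the paper treats multifactorial designs), and the function names `f_statistic` and `permutation_null` are illustrative, not taken from the paper.

```python
import numpy as np


def f_statistic(x, labels):
    """One-way ANOVA F-statistic for each row (gene) of x.

    x      : array of shape (n_genes, n_samples)
    labels : group label of each sample (column)
    """
    labels = np.asarray(labels)
    groups = np.unique(labels)
    n_samples = x.shape[1]
    n_groups = len(groups)
    grand_mean = x.mean(axis=1)
    ss_between = np.zeros(x.shape[0])
    ss_within = np.zeros(x.shape[0])
    for g in groups:
        xg = x[:, labels == g]
        group_mean = xg.mean(axis=1)
        ss_between += xg.shape[1] * (group_mean - grand_mean) ** 2
        ss_within += ((xg - group_mean[:, None]) ** 2).sum(axis=1)
    return (ss_between / (n_groups - 1)) / (ss_within / (n_samples - n_groups))


def permutation_null(x, labels, n_perm, rng):
    """Pool the F-statistics of all genes over n_perm label permutations."""
    labels = np.asarray(labels)
    return np.concatenate(
        [f_statistic(x, rng.permutation(labels)) for _ in range(n_perm)]
    )
```

When some rows of `x` carry a real treatment effect, permuting the labels moves that effect into the within-group variation of those rows, so the pooled values returned by `permutation_null` come from a mixture of null and non-null genes rather than from the null distribution alone. This is the mismatch the abstract describes.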

RESULTS

In this paper, we extend the idea of constructing null statistics from pairwise differences, so as to remove the treatment effects, from the two-sample comparison problem to multifactorial balanced or unbalanced microarray experiments. A null statistic based on a subpartition method is proposed, and its distribution is used to approximate the null distribution of the F-statistic. The proposed null statistic accommodates unbalance in the design and is also corrected for the undue correlation between its numerator and denominator. In simulation studies and real biological data analysis, the number of true positives and the false discovery rate (FDR) obtained with the proposed null statistic are compared with those obtained with the permuted version of the F-statistic. The proposed method is shown to have better control of the FDR and higher power to detect differentially expressed genes than the standard permutation method, because it approximates the tail probabilities better.
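The abstract does not spell out the subpartition construction, so the sketch below illustrates only the underlying two-sample idea it builds on: differences taken between replicates within the same treatment group do not depend on the group mean, so any treatment effect cancels. The helper name `within_group_differences` and the choice to pair adjacent replicates are assumptions made for illustration, not the author's method.

```python
import numpy as np


def within_group_differences(x, labels):
    """Differences between paired replicates inside each group.

    x      : array of shape (n_genes, n_samples)
    labels : group label of each sample (column)

    Adjacent replicates in a group are paired (an illustrative choice);
    an unpaired last replicate in an odd-sized group is dropped.
    """
    labels = np.asarray(labels)
    diffs = []
    for g in np.unique(labels):
        xg = x[:, labels == g]
        m = (xg.shape[1] // 2) * 2  # largest even number of replicates
        diffs.append(xg[:, 0:m:2] - xg[:, 1:m:2])
    return np.hstack(diffs)


# A treatment effect added to one group leaves the differences unchanged.
rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
baseline = rng.normal(size=(3, 8))
treated = baseline.copy()
treated[:, labels == 1] += 5.0  # hypothetical treatment effect
same = np.allclose(
    within_group_differences(baseline, labels),
    within_group_differences(treated, labels),
)
```

Because the differences are free of the treatment effect, statistics built from them behave like draws under the null even for differentially expressed genes, which is what makes them candidates for approximating the null distribution.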
