Modified nonparametric approaches to detecting differentially expressed genes in replicated microarray experiments.

Authors

Zhao Yanli, Pan Wei

Affiliation

Division of Biostatistics, School of Public Health, University of Minnesota, MMC 303, A460 Mayo Building, 420 Delaware Street SE, Minneapolis, MN 55455, USA.

Publication information

Bioinformatics. 2003 Jun 12;19(9):1046-54. doi: 10.1093/bioinformatics/btf879.

DOI:10.1093/bioinformatics/btf879

PMID:12801864

Abstract

MOTIVATION

An important goal in analyzing microarray data is to determine which genes are differentially expressed across two kinds of tissue samples or samples obtained under two experimental conditions. Various parametric tests, such as the two-sample t-test, have been used, but their possibly too strong parametric assumptions or large sample justifications may not hold in practice. As alternatives, a class of three nonparametric statistical methods, including the empirical Bayes method of Efron et al. (2001), the significance analysis of microarray (SAM) method of Tusher et al. (2001) and the mixture model method (MMM) of Pan et al. (2001), have been proposed. All the three methods depend on constructing a test statistic and a so-called null statistic such that the null statistic's distribution can be used to approximate the null distribution of the test statistic. However, relatively little effort has been directed toward assessment of the performance or the underlying assumptions of the methods in constructing such test and null statistics.

RESULTS

We point out a problem of a current method to construct the test and null statistics, which may lead to largely inflated Type I errors (i.e. false positives). We also propose two modifications that overcome the problem. In the context of MMM, the improved performance of the modified methods is demonstrated using simulated data. In addition, our numerical results also provide evidence to support the utility and effectiveness of MMM.

Similar articles

Modified nonparametric approaches to detecting differentially expressed genes in replicated microarray experiments.

Bioinformatics. 2003 Jun 12;19(9):1046-54. doi: 10.1093/bioinformatics/btf879.

On the use of permutation in and the performance of a class of nonparametric methods to detect differential gene expression.

Bioinformatics. 2003 Jul 22;19(11):1333-40. doi: 10.1093/bioinformatics/btg167.

A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments.

Bioinformatics. 2002 Apr;18(4):546-54. doi: 10.1093/bioinformatics/18.4.546.

The t-mixture model approach for detecting differentially expressed genes in microarrays.

Funct Integr Genomics. 2008 Aug;8(3):181-6. doi: 10.1007/s10142-007-0071-6. Epub 2008 Jan 22.

Detecting differentially expressed genes by relative entropy.

J Theor Biol. 2005 Jun 7;234(3):395-402. doi: 10.1016/j.jtbi.2004.11.039. Epub 2005 Jan 24.

Using weighted permutation scores to detect differential gene expression with microarray data.

J Bioinform Comput Biol. 2005 Aug;3(4):989-1006. doi: 10.1142/s021972000500134x.

Construction of null statistics in permutation-based multiple testing for multi-factorial microarray experiments.

Bioinformatics. 2006 Jun 15;22(12):1486-94. doi: 10.1093/bioinformatics/btl109. Epub 2006 Mar 30.

A spline function approach for detecting differentially expressed genes in microarray data analysis.

Bioinformatics. 2004 Nov 22;20(17):2954-63. doi: 10.1093/bioinformatics/bth339. Epub 2004 Jun 4.

Noise sampling method: an ANOVA approach allowing robust selection of differentially regulated genes measured by DNA microarrays.

Bioinformatics. 2003 Jul 22;19(11):1348-59. doi: 10.1093/bioinformatics/btg165.

An improved nonparametric approach for detecting differentially expressed genes with replicated microarray data.

Stat Appl Genet Mol Biol. 2006;5:Article30. doi: 10.2202/1544-6115.1246. Epub 2007 Jan 2.

Cited by

Density distribution of gene expression profiles and evaluation of using maximal information coefficient to identify differentially expressed genes.

PLoS One. 2019 Jul 17;14(7):e0219551. doi: 10.1371/journal.pone.0219551. eCollection 2019.

Maximal information coefficient applied to differentially expressed genes identification: A feasibility study.

Technol Health Care. 2019;27(S1):249-262. doi: 10.3233/THC-199024.

An improved analysis methodology for translational profiling by microarray.

RNA. 2017 Nov;23(11):1601-1613. doi: 10.1261/rna.060525.116. Epub 2017 Aug 25.

f-divergence cutoff index to simultaneously identify differential expression in the integrated transcriptome and proteome.

Nucleic Acids Res. 2016 Jun 2;44(10):e97. doi: 10.1093/nar/gkw157. Epub 2016 Mar 14.

Biological assessment of robust noise models in microarray data analysis.

Bioinformatics. 2011 Mar 15;27(6):807-14. doi: 10.1093/bioinformatics/btr018. Epub 2011 Jan 19.

A new test statistic based on shrunken sample variance for identifying differentially expressed genes in small microarray experiments.

Bioinform Biol Insights. 2008 Feb 29;2:145-56. doi: 10.4137/bbi.s473.

Integrating multiple microarray data for cancer pathway analysis using bootstrapping K-S test.

J Biomed Biotechnol. 2009;2009:707580. doi: 10.1155/2009/707580. Epub 2009 May 26.

Estimating the false discovery rate using mixed normal distribution for identifying differentially expressed genes in microarray data analysis.

Cancer Inform. 2008 Jan 22;3:140-8.

Analyzing microarray data with transitive directed acyclic graphs.

J Bioinform Comput Biol. 2009 Feb;7(1):135-56. doi: 10.1142/s0219720009003972.

ArraySolver: an algorithm for colour-coded graphical display and Wilcoxon signed-rank statistics for comparing microarray gene expression data.

Comp Funct Genomics. 2004;5(1):39-47. doi: 10.1002/cfg.369.
