How accurately can we control the FDR in analyzing microarray data?

Authors

Jung Sin-Ho, Jang Woncheol

Affiliation

Department of Biostatistics and Bioinformatics, Duke University, NC 27710, USA.

Publication information

Bioinformatics. 2006 Jul 15;22(14):1730-6. doi: 10.1093/bioinformatics/btl161. Epub 2006 Apr 27.

DOI:10.1093/bioinformatics/btl161

PMID:16644791

Abstract

We want to evaluate the performance of two FDR-based multiple testing procedures by Benjamini and Hochberg (1995, J. R. Stat. Soc. Ser. B, 57, 289-300) and Storey (2002, J. R. Stat. Soc. Ser. B, 64, 479-498) in analyzing real microarray data. These procedures commonly require independence or weak dependence of the test statistics. However, expression levels of different genes from each array are usually correlated due to coexpressing genes and various sources of errors from experiment-specific and subject-specific conditions that are not adjusted for in data analysis. Because of high dimensionality of microarray data, it is usually impossible to check whether the weak dependence condition is met for a given dataset or not. We propose to generate a large number of test statistics from a simulation model which has asymptotically (in terms of the number of arrays) the same correlation structure as the test statistics that will be calculated from the given data and to investigate how accurately the FDR-based testing procedures control the FDR on the simulated data. Our approach is to directly check the performance of these procedures for a given dataset, rather than to check the weak dependency requirement. We illustrate the proposed method with real microarray datasets, one where the clinical endpoint is disease group and another where it is survival.
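The idea of checking FDR control by simulation can be illustrated with a minimal sketch. It does not reproduce the authors' simulation model, which is built to match the correlation structure of the given data. As a stand-in, it assumes equicorrelated standard normal test statistics with a hypothetical common correlation `rho`, applies the Benjamini-Hochberg step-up rule, and averages the false discovery proportion over repeated simulations. The function names `bh_reject` and `simulate_fdr` and all parameter values are illustrative assumptions.

```python
import math

import numpy as np


def bh_reject(pvals, q=0.05):
    """Benjamini-Hochberg step-up rule: boolean mask of rejected hypotheses."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = q * np.arange(1, m + 1) / m
    passed = np.nonzero(p[order] <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if passed.size:
        # Reject every hypothesis up to the largest k with p_(k) <= q * k / m.
        reject[order[: passed[-1] + 1]] = True
    return reject


# Elementwise complementary error function, used for two-sided normal p-values.
_erfc = np.vectorize(math.erfc)


def simulate_fdr(m=1000, m1=50, effect=3.0, rho=0.3, q=0.05, n_sim=200, seed=0):
    """Average false discovery proportion of BH under an assumed toy model.

    Assumption (not from the paper): z-statistics are equicorrelated N(0, 1)
    with common correlation rho, and the first m1 genes are shifted by `effect`.
    """
    rng = np.random.default_rng(seed)
    is_alt = np.zeros(m, dtype=bool)
    is_alt[:m1] = True
    fdp = np.empty(n_sim)
    for s in range(n_sim):
        shared = rng.standard_normal()  # factor shared by all genes on an array
        z = math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.standard_normal(m)
        z[is_alt] += effect
        pvals = _erfc(np.abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
        reject = bh_reject(pvals, q)
        false_rejections = (reject & ~is_alt).sum()
        fdp[s] = false_rejections / max(reject.sum(), 1)
    return float(fdp.mean())


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.6):
        print(f"rho={rho:.1f}  estimated FDR={simulate_fdr(rho=rho):.3f}")
```

Comparing the estimated FDR with the nominal level `q` across values of `rho` gives a rough sense of how accurately the procedure controls the FDR under this assumed dependence. An analysis in the spirit of the abstract would replace the toy correlation model with one derived from the dataset at hand.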

Similar articles

A note on using permutation-based false discovery rate estimates to compare different analysis methods for microarray data.

Bioinformatics. 2005 Dec 1;21(23):4280-8. doi: 10.1093/bioinformatics/bti685. Epub 2005 Sep 27.

Multidimensional local false discovery rate for microarray studies.

Bioinformatics. 2006 Mar 1;22(5):556-65. doi: 10.1093/bioinformatics/btk013. Epub 2005 Dec 20.

Estimation of false discovery proportion under general dependence.

Bioinformatics. 2006 Dec 15;22(24):3025-31. doi: 10.1093/bioinformatics/btl527. Epub 2006 Oct 17.

Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data.

BMC Bioinformatics. 2005 Feb 10;6:26. doi: 10.1186/1471-2105-6-26.

Practical FDR-based sample size calculations in microarray experiments.

Bioinformatics. 2005 Aug 1;21(15):3264-72. doi: 10.1093/bioinformatics/bti519. Epub 2005 Jun 2.

Sample size for FDR-control in microarray data analysis.

Bioinformatics. 2005 Jul 15;21(14):3097-104. doi: 10.1093/bioinformatics/bti456. Epub 2005 Apr 21.

A new outlier removal approach for cDNA microarray normalization.

Biotechniques. 2009 Aug;47(2):691-2, 694-700. doi: 10.2144/000113195.

Construction of null statistics in permutation-based multiple testing for multi-factorial microarray experiments.

Bioinformatics. 2006 Jun 15;22(12):1486-94. doi: 10.1093/bioinformatics/btl109. Epub 2006 Mar 30.

Combining multiple microarrays in the presence of controlling variables.

Bioinformatics. 2006 Jul 15;22(14):1682-9. doi: 10.1093/bioinformatics/btl183. Epub 2006 May 16.

Cited by

Integrated analysis of microRNA and mRNA expression profiles in rats with selenium deficiency and identification of associated miRNA-mRNA network.

Sci Rep. 2018 Apr 26;8(1):6601. doi: 10.1038/s41598-018-24826-w.

Using the Generalized Index of Dissimilarity to Detect Gene-Gene Interactions in Multi-Class Phenotypes.

PLoS One. 2016 Aug 24;11(8):e0158668. doi: 10.1371/journal.pone.0158668. eCollection 2016.

Statistical Issues in the Design and Analysis of nCounter Projects.

Cancer Inform. 2014 Dec 14;13(Suppl 7):35-43. doi: 10.4137/CIN.S16343. eCollection 2014.

A modified entropy-based approach for identifying gene-gene interactions in case-control study.

PLoS One. 2013 Jul 18;8(7):e69321. doi: 10.1371/journal.pone.0069321. Print 2013.

A gene selection method for GeneChip array data with small sample sizes.

BMC Genomics. 2011 Dec 23;12 Suppl 5(Suppl 5):S7. doi: 10.1186/1471-2164-12-S5-S7.

Statistical considerations for analysis of microarray experiments.

Clin Transl Sci. 2011 Dec;4(6):466-77. doi: 10.1111/j.1752-8062.2011.00309.x. Epub 2011 Nov 7.

Nonparametric methods for the analysis of single-color pathogen microarrays.

BMC Bioinformatics. 2010 Jun 28;11:354. doi: 10.1186/1471-2105-11-354.

Sample size calculation for microarray experiments with blocked one-way design.

BMC Bioinformatics. 2009 May 28;10:164. doi: 10.1186/1471-2105-10-164.

Estimating the false discovery rate using mixed normal distribution for identifying differentially expressed genes in microarray data analysis.

Cancer Inform. 2008 Jan 22;3:140-8.

Effects of dependence in high-dimensional multiple testing problems.

BMC Bioinformatics. 2008 Feb 25;9:114. doi: 10.1186/1471-2105-9-114.
