Estimation of false discovery rates in multiple testing: application to gene microarray data.

Author information

Tsai Chen-An, Hsueh Huey-miin, Chen James J

Affiliations

Division of Biometry and Risk Assessment, National Center for Toxicological Research, Food and Drug Administration, Jefferson, Arkansas, USA.

Publication information

Biometrics. 2003 Dec;59(4):1071-81. doi: 10.1111/j.0006-341x.2003.00123.x.

DOI:10.1111/j.0006-341x.2003.00123.x

PMID:14969487

Abstract

Testing for significance with gene expression data from DNA microarray experiments involves simultaneous comparisons of hundreds or thousands of genes. If R denotes the number of rejections (genes declared significant) and V denotes the number of false rejections, then V/R, if R > 0, is the proportion of falsely rejected hypotheses. This paper proposes a model for the distribution of the number of rejections, R, and for the conditional distribution of V given R, written V | R. Under the independence assumption, R is distributed as a convolution of two binomials and V | R follows a noncentral hypergeometric distribution. Under an equicorrelated model, the distributions are more complex and are also derived. Five false discovery rate error measures are considered: FDR = E(V/R), pFDR = E(V/R | R > 0) (positive FDR), cFDR = E(V/R | R = r) (conditional FDR), mFDR = E(V)/E(R) (marginal FDR), and eFDR = E(V)/r (empirical FDR). The pFDR, cFDR, and mFDR are shown to be equivalent under the Bayesian framework, in which the number of true null hypotheses is modeled as a random variable. We present a parametric and a bootstrap procedure to estimate the FDRs. Monte Carlo simulations were conducted to evaluate the performance of these two methods. The bootstrap procedure appears to perform reasonably well, even when the alternative hypotheses are correlated (rho = 0.25). An example from a toxicogenomic microarray experiment is presented for illustration.
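The independence model summarized in the abstract can be illustrated with a small numerical sketch. This is not the authors' code. It assumes, for illustration only, that each of m0 true null hypotheses is rejected independently with probability alpha and each of m1 false null hypotheses with probability power, so that V ~ Binomial(m0, alpha) and R - V ~ Binomial(m1, power). All function and parameter names are hypothetical, and V/R is taken to be 0 when R = 0, which is the usual convention but is not stated in the abstract. The sketch enumerates the joint distribution of (V, R), checks that V given R = r matches the Fisher noncentral hypergeometric form, and evaluates the five error measures.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); zero outside 0..n."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def joint_pmf(m0, m1, alpha, power):
    """Joint pmf of (V, R), assuming V ~ Bin(m0, alpha) and
    R - V ~ Bin(m1, power) are independent (illustrative assumption)."""
    return {
        (v, v + s): binom_pmf(v, m0, alpha) * binom_pmf(s, m1, power)
        for v in range(m0 + 1)
        for s in range(m1 + 1)
    }


def cond_v_given_r(joint, r):
    """Conditional pmf of V given R = r, read off the joint pmf."""
    p_r = sum(p for (_, big_r), p in joint.items() if big_r == r)
    return {v: p / p_r for (v, big_r), p in joint.items() if big_r == r}


def fisher_nch_pmf(m0, m1, alpha, power, r):
    """Fisher noncentral hypergeometric pmf for V given R = r.
    The odds ratio omega compares rejection odds under null vs. alternative."""
    omega = (alpha / (1 - alpha)) / (power / (1 - power))
    lo, hi = max(0, r - m1), min(m0, r)
    weights = {
        v: comb(m0, v) * comb(m1, r - v) * omega**v for v in range(lo, hi + 1)
    }
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


def fdr_measures(m0, m1, alpha, power, r):
    """The five error measures named in the abstract, for an observed r >= 1."""
    joint = joint_pmf(m0, m1, alpha, power)
    p_pos = sum(p for (_, big_r), p in joint.items() if big_r > 0)
    # FDR = E(V/R), with V/R counted as 0 when R = 0 (assumed convention).
    fdr = sum(p * v / big_r for (v, big_r), p in joint.items() if big_r > 0)
    cond = cond_v_given_r(joint, r)
    e_v = m0 * alpha
    e_r = m0 * alpha + m1 * power
    return {
        "FDR": fdr,
        "pFDR": fdr / p_pos,  # E(V/R | R > 0)
        "cFDR": sum(v * p for v, p in cond.items()) / r,  # E(V/R | R = r)
        "mFDR": e_v / e_r,  # E(V)/E(R)
        "eFDR": e_v / r,  # E(V)/r
    }


# Hypothetical small example: 20 tests, 15 true nulls, 5 rejections observed.
measures = fdr_measures(m0=15, m1=5, alpha=0.05, power=0.8, r=5)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Exhaustive enumeration is used here only because it makes the definitions explicit; it scales as m0 × m1 and would be replaced by a direct convolution or by simulation for array-sized problems. The sketch treats m0 as fixed, so it does not illustrate the Bayesian equivalence result, and it does not attempt to reproduce the paper's parametric or bootstrap estimators.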

Similar articles

Estimation of false discovery rates in multiple testing: application to gene microarray data.

Biometrics. 2003 Dec;59(4):1071-81. doi: 10.1111/j.0006-341x.2003.00123.x.

Resampling-based empirical Bayes multiple testing procedures for controlling generalized tail probability and expected value error rates: focus on the false discovery rate and simulation study.

Biom J. 2008 Oct;50(5):716-44. doi: 10.1002/bimj.200710473.

Empirical Bayes screening of many p-values with applications to microarray studies.

Bioinformatics. 2005 May 1;21(9):1987-94. doi: 10.1093/bioinformatics/bti301. Epub 2005 Feb 2.

A mixture model for estimating the local false discovery rate in DNA microarray analysis.

Bioinformatics. 2004 Nov 1;20(16):2694-701. doi: 10.1093/bioinformatics/bth310. Epub 2004 May 14.

Re-sampling strategy to improve the estimation of number of null hypotheses in FDR control under strong correlation structures.

BMC Bioinformatics. 2007 May 18;8:157. doi: 10.1186/1471-2105-8-157.

Improving false discovery rate estimation.

Bioinformatics. 2004 Jul 22;20(11):1737-45. doi: 10.1093/bioinformatics/bth160. Epub 2004 Feb 26.

Comparison of methods for estimating the number of true null hypotheses in multiplicity testing.

J Biopharm Stat. 2003 Nov;13(4):675-89. doi: 10.1081/BIP-120024202.

The Beta-Binomial Distribution for Estimating the Number of False Rejections in Microarray Gene Expression Studies.

Comput Stat Data Anal. 2009 Mar 15;53(5):1688-1700. doi: 10.1016/j.csda.2008.01.013.

The false discovery rate: a key concept in large-scale genetic studies.

Cancer Control. 2010 Jan;17(1):58-62. doi: 10.1177/107327481001700108.

Construction of null statistics in permutation-based multiple testing for multi-factorial microarray experiments.

Bioinformatics. 2006 Jun 15;22(12):1486-94. doi: 10.1093/bioinformatics/btl109. Epub 2006 Mar 30.

Cited by

Application of multiple testing procedures for identifying relevant comorbidities, from a large set, in traumatic brain injury for research applications utilizing big health-administrative data.

Front Big Data. 2022 Sep 28;5:793606. doi: 10.3389/fdata.2022.793606. eCollection 2022.

Registered Report: Transcriptional Analysis of Savings Memory Suggests Forgetting is Due to Retrieval Failure.

eNeuro. 2020 Nov 12;7(6). doi: 10.1523/ENEURO.0313-19.2020. Print 2020 Nov/Dec.

Time-Dependent miRNA Profile during Septic Acute Kidney Injury in Mice.

Int J Mol Sci. 2020 Jul 27;21(15):5316. doi: 10.3390/ijms21155316.

Transcriptional correlates of memory maintenance following long-term sensitization of .

Learn Mem. 2017 Sep 15;24(10):502-515. doi: 10.1101/lm.045450.117. Print 2017 Oct.

A gene-signature progression approach to identifying candidate small-molecule cancer therapeutics with connectivity mapping.

BMC Bioinformatics. 2016 May 11;17(1):211. doi: 10.1186/s12859-016-1066-x.

ECR-MAPK regulation in liver early development.

Biomed Res Int. 2014;2014:850802. doi: 10.1155/2014/850802. Epub 2014 Dec 18.

The effect of colonoscopy on whole blood gene expression profile: an experimental investigation for colorectal cancer biomarker discovery.

J Cancer Res Clin Oncol. 2015 Apr;141(4):591-9. doi: 10.1007/s00432-014-1837-6. Epub 2014 Oct 12.

Characterization of the rapid transcriptional response to long-term sensitization training in Aplysia californica.

Neurobiol Learn Mem. 2014 Dec;116:27-35. doi: 10.1016/j.nlm.2014.07.009. Epub 2014 Aug 10.

A novel three serum phospholipid panel differentiates normal individuals from those with prostate cancer.

PLoS One. 2014 Mar 6;9(3):e88841. doi: 10.1371/journal.pone.0088841. eCollection 2014.

A methylome-wide study of aging using massively parallel sequencing of the methyl-CpG-enriched genomic fraction from blood in over 700 subjects.

Hum Mol Genet. 2014 Mar 1;23(5):1175-85. doi: 10.1093/hmg/ddt511. Epub 2013 Oct 16.
