ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates.

Author affiliation

Computational & Mathematical Biology, Genome Institute of Singapore, 60 Biopolis Street, Singapore 138672, Singapore.

Publication information

Biol Direct. 2011 May 20;6:27. doi: 10.1186/1745-6150-6-27.

Abstract

BACKGROUND

False discovery rate (FDR) control is commonly accepted as the most appropriate form of error control in multiple hypothesis testing problems. The accuracy of FDR estimation depends on the accuracy of the p-values estimated for each test and on the validity of the underlying distributional assumptions. However, in many practical testing problems, such as those in genomics, the p-values may be under-estimated or over-estimated for known or unknown reasons. The resulting FDR estimates are then distorted and become unreliable.
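To make the effect of miscalibrated p-values concrete, the following is a minimal sketch, not taken from the paper. It assumes the standard Benjamini-Hochberg step-up procedure as the FDR method and a hypothetical miscalibration in which every p-value is squared (that is, systematically under-estimated). All data are simulated true nulls, so any rejection is a false discovery.

```python
import random


def bh_rejections(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up: reject the k smallest p-values, where k is
    the largest rank with p_(k) <= alpha * k / m. Returns k."""
    m = len(p_values)
    k = 0
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= alpha * rank / m:
            k = rank
    return k


rng = random.Random(1)

# Every hypothesis is a true null, so calibrated p-values are Uniform(0, 1).
calibrated = [rng.random() for _ in range(5000)]

# Hypothetical miscalibration (an assumption for illustration only):
# squaring pushes every p-value toward zero.
underestimated = [p ** 2 for p in calibrated]

n_calibrated = bh_rejections(calibrated)
n_underestimated = bh_rejections(underestimated)

print("rejections with calibrated p-values:    ", n_calibrated)
print("rejections with under-estimated p-values:", n_underestimated)
```

Because squaring can only shrink a p-value in [0, 1], the miscalibrated set never yields fewer rejections than the calibrated one. With no true alternatives present, those extra rejections are all false discoveries that the nominal FDR level does not account for.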

RESULTS

We propose a new extrapolative method called Constrained Regression Recalibration (ConReg-R) that recalibrates the empirical p-values by modeling their distribution, in order to improve the FDR estimates. ConReg-R is based on two observations: accurately estimated p-values from true null hypotheses follow a uniform distribution, and the observed distribution of p-values is a mixture of the p-value distributions from true null hypotheses and true alternative hypotheses. Hence, ConReg-R recalibrates the observed p-values so that they exhibit the properties of an ideal empirical p-value distribution. The proportion of true null hypotheses (π0) and the FDR are estimated after the recalibration.
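The mixture assumption above can be illustrated with a short simulation. This is a sketch under stated assumptions, not the ConReg-R recalibration itself. It draws null p-values from Uniform(0, 1) and alternative p-values from Beta(0.2, 1) (an arbitrary stand-in), then estimates π0 from the flat right-hand tail using a Storey-type estimator and plugs it into a simple FDR estimate. The function names, the tuning parameter `lam`, and the rejection threshold are all hypothetical choices made for illustration.

```python
import random


def simulate_p_values(m, pi0, seed=0):
    """Draw m p-values: a fraction pi0 from Uniform(0, 1) (true nulls) and the
    rest from Beta(0.2, 1), an arbitrary stand-in for true alternatives."""
    rng = random.Random(seed)
    m0 = round(m * pi0)
    nulls = [rng.random() for _ in range(m0)]
    alternatives = [rng.betavariate(0.2, 1.0) for _ in range(m - m0)]
    return nulls + alternatives


def estimate_pi0(p_values, lam=0.5):
    """Storey-type estimate: p-values above lam are assumed to come mostly
    from nulls, whose density is flat, so pi0 ~ #{p > lam} / (m * (1 - lam))."""
    m = len(p_values)
    tail = sum(1 for p in p_values if p > lam)
    return min(1.0, tail / (m * (1.0 - lam)))


def estimate_fdr(p_values, threshold, pi0):
    """Estimated FDR when rejecting all p <= threshold: expected number of
    null rejections (pi0 * m * threshold) over observed rejections."""
    m = len(p_values)
    rejected = sum(1 for p in p_values if p <= threshold)
    return min(1.0, pi0 * m * threshold / max(rejected, 1))


p_values = simulate_p_values(m=10000, pi0=0.8)
pi0_hat = estimate_pi0(p_values)
fdr_hat = estimate_fdr(p_values, threshold=0.01, pi0=pi0_hat)

print(f"estimated pi0: {pi0_hat:.3f} (simulated with pi0 = 0.8)")
print(f"estimated FDR at p <= 0.01: {fdr_hat:.3f}")
```

Both estimates rely on the null component being uniform. If the input p-values are miscalibrated, the tail above `lam` is no longer flat and both π0 and the FDR estimate drift, which is the situation the recalibration step is meant to correct before estimation.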

CONCLUSIONS

ConReg-R provides an efficient way to improve the FDR estimates. It only requires the p-values from the tests and avoids permutation of the original test data. We demonstrate that the proposed method significantly improves FDR estimation on several gene expression datasets obtained from microarray and RNA-seq experiments.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63b7/3130718/5f6ef711d388/1745-6150-6-27-1.jpg
