A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments.

Authors

Heskes Tom, Eisinga Rob, Breitling Rainer

Affiliations

Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands.

Department of Social Science Research Methods, Radboud University Nijmegen, Nijmegen, The Netherlands.

Publication information

BMC Bioinformatics. 2014 Nov 21;15(1):367. doi: 10.1186/s12859-014-0367-1.

Abstract

BACKGROUND

The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Exact calculation, permutation, and gamma approximations have all been proposed to determine molecule-level significance. These approaches have serious drawbacks: they are either computationally burdensome or provide inaccurate estimates in the tail of the p-value distribution.
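To make the trade-off concrete, the following is a minimal Python sketch of two of the existing approaches mentioned above. It is an illustration only and is not the algorithm proposed in the paper. It assumes that the statistic is the product of a molecule's ranks over k replicates, that under the null hypothesis each rank is independently uniform on 1..n, and that small rank products are the interesting ones. The function names are hypothetical, and the gamma approximation is written in one common form (ranks scaled by n + 1 and treated as continuous uniforms), which is also an assumption.

```python
import math
from itertools import product


def rank_product(ranks):
    """Product of one molecule's ranks across its replicate experiments."""
    return math.prod(ranks)


def exact_p_value(rp, n, k):
    """P(RP <= rp) under the null, by enumerating all n**k rank tuples.

    Brute force: the cost grows as n**k, so this is only feasible for
    tiny n and k. It illustrates why exact calculation is burdensome.
    """
    hits = sum(
        1
        for combo in product(range(1, n + 1), repeat=k)
        if math.prod(combo) <= rp
    )
    return hits / n ** k


def gamma_p_value(rp, n, k):
    """Gamma approximation to P(RP <= rp) (assumed form, see lead-in).

    If each r / (n + 1) is treated as Uniform(0, 1), then the negative log
    of the scaled product follows a Gamma(k, 1) distribution. For integer
    k the upper tail has the closed Erlang form used below.
    """
    x = k * math.log(n + 1) - math.log(rp)
    return math.exp(-x) * sum(x ** j / math.factorial(j) for j in range(k))


if __name__ == "__main__":
    n, k = 10, 3  # 10 molecules, 3 replicates (toy sizes)
    for rp in (1, 2, 8, 100):
        print(rp, exact_p_value(rp, n, k), gamma_p_value(rp, n, k))
```

On toy sizes such as these, comparing the two columns for the smallest rank products shows how far a continuous approximation can drift from the exact discrete probability in the tail, while the exact enumeration becomes impractical as n and k grow.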

RESULTS

We derive strict lower and upper bounds to the exact p-value along with an accurate approximation that can be used to assess the significance of the rank product statistic in a computationally fast manner. The bounds and the proposed approximation are shown to provide far better accuracy than existing approximate methods in determining tail probabilities, with the slightly conservative upper bound protecting against false positives. We illustrate the proposed method in the context of a recently published analysis of transcriptomic profiling performed in blood.

CONCLUSIONS

We provide a method to determine upper bounds and accurate approximate p-values of the rank product statistic. The proposed algorithm provides an order of magnitude increase in throughput compared with current approaches and offers the opportunity to explore new application domains with even larger multiple testing issues. The R code is published in one of the Additional files and is available at http://www.ru.nl/publish/pages/726696/rankprodbounds.zip .

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b23/4245829/001038e39c42/12859_2014_367_Fig1_HTML.jpg