A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments.

Authors

Heskes Tom, Eisinga Rob, Breitling Rainer

Affiliations

Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands.

Department of Social Science Research Methods, Radboud University Nijmegen, Nijmegen, The Netherlands.

Publication information

BMC Bioinformatics. 2014 Nov 21;15(1):367. doi: 10.1186/s12859-014-0367-1.

DOI:10.1186/s12859-014-0367-1

PMID:25413493

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4245829/

Abstract

BACKGROUND

The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Exact calculation, permutation, and gamma approximation have all been proposed to determine molecule-level significance. These approaches have serious drawbacks: they are either computationally burdensome or provide inaccurate estimates in the tail of the p-value distribution.
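To make the statistic concrete, the following is a minimal sketch of the rank product and of the gamma approximation mentioned above. It is not the authors' code. It assumes the usual null model in which each of the k replicate ranks is independent and uniform on 1..n, and it treats r/(n+1) as a continuous Uniform(0,1) variable so that the negative log rank product follows a Gamma(k, 1) distribution. The function names are hypothetical.

```python
import math


def rank_product(ranks):
    """Product of one molecule's ranks across k replicate experiments."""
    return math.prod(ranks)


def gamma_sf_integer_shape(x, k):
    """P(G >= x) for G ~ Gamma(shape=k, scale=1) with integer k >= 1.

    Uses the closed form exp(-x) * sum_{j=0}^{k-1} x^j / j!.
    """
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    return math.exp(-x) * total


def gamma_approx_pvalue(ranks, n):
    """Gamma approximation to P(rank product <= observed) under the null.

    Assumption (not taken from the abstract): ranks are scaled by n + 1,
    so -sum(log(r / (n + 1))) is treated as Gamma(k, 1).
    """
    k = len(ranks)
    x = -sum(math.log(r / (n + 1)) for r in ranks)
    return gamma_sf_integer_shape(x, k)


if __name__ == "__main__":
    ranks = [2, 5, 1]  # hypothetical ranks of one molecule in 3 replicates
    print(rank_product(ranks), gamma_approx_pvalue(ranks, n=1000))
```

Because the true ranks are discrete, this continuous approximation tends to be least reliable for very small rank products, which is the tail that matters under multiple testing.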

RESULTS

We derive strict lower and upper bounds to the exact p-value along with an accurate approximation that can be used to assess the significance of the rank product statistic in a computationally fast manner. The bounds and the proposed approximation are shown to provide far better accuracy than existing approximate methods in determining tail probabilities, with the slightly conservative upper bound protecting against false positives. We illustrate the proposed method in the context of a recently published analysis on transcriptomic profiling performed in blood.
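For reference, the exact p-value that these bounds bracket can be computed by brute force when n and k are small. The sketch below is not the authors' bounding algorithm, which the abstract does not spell out. It simply enumerates the null distribution of the product of k independent ranks, each assumed uniform on 1..n, and the function names are hypothetical.

```python
from collections import Counter


def product_distribution(n, k):
    """Count how many of the n**k rank tuples yield each rank product."""
    dist = Counter({1: 1})
    for _ in range(k):
        nxt = Counter()
        for prod, count in dist.items():
            for r in range(1, n + 1):
                nxt[prod * r] += count
        dist = nxt
    return dist


def exact_pvalue(rho, n, k):
    """Exact P(rank product <= rho) under the independent-uniform null."""
    dist = product_distribution(n, k)
    favourable = sum(c for prod, c in dist.items() if prod <= rho)
    return favourable / n**k


if __name__ == "__main__":
    # Hypothetical small case: 10 molecules, 3 replicates, observed product 4
    print(exact_pvalue(4, n=10, k=3))
```

The number of distinct products grows quickly with n and k, which gives a sense of why exact calculation becomes burdensome at realistic problem sizes and why fast bounds are attractive.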

CONCLUSIONS

We provide a method to determine upper bounds and accurate approximate p-values of the rank product statistic. The proposed algorithm provides an order of magnitude increase in throughput as compared with current approaches and offers the opportunity to explore new application domains with even larger multiple testing issues. The R code is published in one of the Additional files and is available at http://www.ru.nl/publish/pages/726696/rankprodbounds.zip.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1b23/4245829/001038e39c42/12859_2014_367_Fig1_HTML.jpg
