The generalized Fisher's combination and accurate p-value calculation under dependence.

Affiliations

Biostatistics and Research Decision Sciences, Merck Research Laboratories, Rahway, New Jersey, USA.

Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, Massachusetts, USA.

Publication information

Biometrics. 2023 Jun;79(2):1159-1172. doi: 10.1111/biom.13634. Epub 2022 Mar 9.

DOI:10.1111/biom.13634

PMID:35178716

Abstract

Combining dependent tests of significance has broad applications, but the related p-value calculation is challenging. For Fisher's combination test, current p-value calculation methods (e.g., Brown's approximation) tend to inflate the type I error rate when the desired significance level is substantially less than 0.05. This problem could lead to significant false discoveries in big data analyses. This paper provides two main contributions. First, it presents a general family of Fisher-type statistics, referred to as the GFisher, which covers many classic statistics, such as Fisher's combination, Good's statistic, Lancaster's statistic, and the weighted Z-score combination. The GFisher allows a flexible weighting scheme, as well as an omnibus procedure that automatically adapts the weights and the statistic-defining parameters to the given data. Second, the paper presents several new p-value calculation methods based on two novel ideas: moment-ratio matching and joint-distribution surrogating. Systematic simulations show that the new calculation methods are more accurate under the multivariate Gaussian distribution, and more robust under the generalized linear model and the multivariate t-distribution. The applications of the GFisher and the new p-value calculation methods are demonstrated by a gene-based single nucleotide polymorphism (SNP)-set association study. The relevant computation has been implemented in the R package GFisher, available on the Comprehensive R Archive Network.
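For orientation, the classical Fisher's combination that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration and not the paper's GFisher method or the API of the GFisher R package: it assumes the input p-values are independent, in which case the statistic -2 Σ log p_i follows a chi-square distribution with 2n degrees of freedom under the null. The function names `chi2_sf_even` and `fisher_combine` are hypothetical. Under dependence, which is the setting the abstract addresses, this reference distribution no longer holds.

```python
import math


def chi2_sf_even(x, df):
    """Survival function P(X > x) of a chi-square variable with even df.

    For df = 2n the closed form is exp(-x/2) * sum_{k=0}^{n-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # current term (x/2)^k / k!, starting at k = 0
    total = 1.0  # running sum of the terms
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def fisher_combine(pvalues):
    """Classical Fisher's combination (hypothetical helper name).

    ASSUMPTION: the p-values are independent. Returns the statistic
    T = -2 * sum(log p_i) and its chi-square(2n) upper-tail p-value.
    """
    if not pvalues or any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must be non-empty and lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    return stat, chi2_sf_even(stat, 2 * len(pvalues))


if __name__ == "__main__":
    stat, pval = fisher_combine([0.05, 0.05])
    print(f"T = {stat:.4f}, combined p-value = {pval:.6f}")
```

With a single input the combined p-value equals the input itself, which is a quick sanity check. When the tests are correlated, the variance of the statistic exceeds the chi-square(2n) value, which is why approximations such as Brown's, and the new methods described in the abstract, are needed.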


Similar articles

The generalized Fisher's combination and accurate p-value calculation under dependence.

Biometrics. 2023 Jun;79(2):1159-1172. doi: 10.1111/biom.13634. Epub 2022 Mar 9.

Optimally weighted Z-test is a powerful method for combining probabilities in meta-analysis.

J Evol Biol. 2011 Aug;24(8):1836-41. doi: 10.1111/j.1420-9101.2011.02297.x. Epub 2011 May 23.

Simultaneous detection of novel genes and SNPs by adaptive p-value combination.

Front Genet. 2022 Nov 17;13:1009428. doi: 10.3389/fgene.2022.1009428. eCollection 2022.

A gene based combination test using GWAS summary data.

BMC Bioinformatics. 2023 Jan 3;24(1):2. doi: 10.1186/s12859-022-05114-x.

Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures.

J Am Stat Assoc. 2020;115(529):393-402. doi: 10.1080/01621459.2018.1554485. Epub 2019 Apr 25.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Fisher's method of combining dependent statistics using generalizations of the gamma distribution with applications to genetic pleiotropic associations.

Biostatistics. 2014 Apr;15(2):284-95. doi: 10.1093/biostatistics/kxt045. Epub 2013 Oct 29.

P-value evaluation, variability index and biomarker categorization for adaptively weighted Fisher's meta-analysis method in omics applications.

Bioinformatics. 2020 Jan 15;36(2):524-532. doi: 10.1093/bioinformatics/btz589.

Using Lancaster's mid-P correction to the Fisher's exact test for adverse impact analyses.

J Appl Psychol. 2011 Sep;96(5):956-65. doi: 10.1037/a0024223.

A Comparison of Methods for Gene-Based Testing That Account for Linkage Disequilibrium.

Front Genet. 2022 May 5;13:867724. doi: 10.3389/fgene.2022.867724. eCollection 2022.

Cited by

Federated epidemic surveillance.

PLoS Comput Biol. 2025 Apr 8;21(4):e1012907. doi: 10.1371/journal.pcbi.1012907. eCollection 2025 Apr.

Ensemble methods for testing a global null.

J R Stat Soc Series B Stat Methodol. 2024 Apr;86(2):461-486. doi: 10.1093/jrsssb/qkad131. Epub 2023 Nov 30.

A unified combination framework for dependent tests with applications to microbiome association studies.

Biometrics. 2025 Jan 7;81(1). doi: 10.1093/biomtc/ujaf001.

Healthcare workers' knowledge and attitudes regarding artificial intelligence adoption in healthcare: A cross-sectional study.

Heliyon. 2024 Nov 29;10(23):e40775. doi: 10.1016/j.heliyon.2024.e40775. eCollection 2024 Dec 15.

Construction of an immune-related risk score signature for gastric cancer based on multi-omics data.

Sci Rep. 2024 Jan 16;14(1):1422. doi: 10.1038/s41598-024-52087-3.

Construction of a prognostic 6-gene signature for breast cancer based on multi-omics and single-cell data.

Front Oncol. 2023 Nov 21;13:1186858. doi: 10.3389/fonc.2023.1186858. eCollection 2023.

Simultaneous detection of novel genes and SNPs by adaptive p-value combination.

Front Genet. 2022 Nov 17;13:1009428. doi: 10.3389/fgene.2022.1009428. eCollection 2022.

Recent advances and challenges of rare variant association analysis in the biobank sequencing era.

Front Genet. 2022 Oct 6;13:1014947. doi: 10.3389/fgene.2022.1014947. eCollection 2022.

Evaluating statistical significance in a meta-analysis by using numerical integration.

Comput Struct Biotechnol J. 2022 Jul 4;20:3615-3620. doi: 10.1016/j.csbj.2022.06.055. eCollection 2022.
