Estimating the Proportion of True Null Hypotheses Using the Pattern of Observed p-values.

Authors

Tong Tiejun, Feng Zeny, Hilton Julia S, Zhao Hongyu

Affiliations

Department of Mathematics, Hong Kong Baptist University, Hong Kong; Institute of Computational and Theoretical Studies, Hong Kong Baptist University, Hong Kong.

Publication information

J Appl Stat. 2013 Jan 1;40(9):1949-1964. doi: 10.1080/02664763.2013.800035.

DOI:10.1080/02664763.2013.800035

PMID:24078762

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3781956/

Abstract

Estimating the proportion of true null hypotheses, π, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π in the literature are motivated by the assumption that the test statistics are independent, which is often not true in reality. Simulations indicate that most existing estimators can perform poorly when the test statistics are dependent, mainly due to the increased variation of these estimators. In this paper, we propose several data-driven methods for estimating π by incorporating the distribution pattern of the observed p-values as a practical approach to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate for the proportion of true-null p-values in (λ, 1] over the whole range [0, 1] instead of using the expected proportion at 1 - λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance.
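For orientation, the "expected proportion at 1 - λ" that the abstract contrasts against can be illustrated with the conventional threshold-based estimator, which divides the observed fraction of p-values in (λ, 1] by 1 - λ. The sketch below is not the authors' proposed linear-fit method; the function name `threshold_pi0`, the choice λ = 0.5, and the simulated Beta-distributed alternatives are assumptions made purely for illustration.

```python
import numpy as np


def threshold_pi0(pvalues, lam=0.5):
    """Baseline estimate of the true-null proportion (hypothetical helper).

    Uses the fact that true-null p-values are uniform on [0, 1], so a
    fraction (1 - lam) of them is expected to fall in (lam, 1]:
        pi_hat = #{p_i > lam} / (m * (1 - lam)), capped at 1.
    """
    p = np.asarray(pvalues, dtype=float)
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    count_above = np.count_nonzero(p > lam)
    return min(1.0, count_above / (p.size * (1.0 - lam)))


# Simulated, independent p-values (an assumption for this demo only):
# 80% true nulls (uniform) and 20% alternatives concentrated near zero.
rng = np.random.default_rng(0)
m, pi_true = 10_000, 0.8
n_null = int(m * pi_true)
p_null = rng.uniform(size=n_null)
p_alt = rng.beta(0.2, 4.0, size=m - n_null)
pi_hat = threshold_pi0(np.concatenate([p_null, p_alt]), lam=0.5)
print(f"estimated true-null proportion: {pi_hat:.3f}")
```

Under independence this baseline tends to land close to the true proportion. The abstract's point is that under dependence its variance can grow, which is what motivates replacing the fixed divisor 1 - λ with a data-driven quantity.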

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f0f4/3781956/b03290400af3/nihms475055f1.jpg
