Estimation of the proportion of true null hypotheses under sparse dependence: Adaptive FDR controlling in microarray data.

Affiliations

Department of Statistics, Dibrugarh University, Dibrugarh, Assam, India.

Center for Biotechnology and Bioinformatics, Dibrugarh University, Dibrugarh, Assam, India.

Publication information

Stat Methods Med Res. 2022 May;31(5):917-927. doi: 10.1177/09622802221074164. Epub 2022 Feb 8.

DOI:10.1177/09622802221074164

PMID:35133933

Abstract

The proportion of non-differentially expressed genes is an important quantity in microarray data analysis, and an appropriate estimate of it is used to construct adaptive multiple testing procedures. Most estimators of the proportion of true null hypotheses, whether based on thresholding, maximum likelihood, or density estimation, assume independence among gene expressions. A sparse dependence structure is usually natural for modelling associations in microarray gene expression data, so methods are needed that accommodate sparse dependence within the framework of existing estimators. We propose a clustering-based method that places genes that are not coexpressed in the same group, using the high-dimensional correlation structure estimated under a sparsity assumption as the dissimilarity matrix. This novel method is applied to three existing estimators of the proportion of true null hypotheses. An extensive simulation study shows that the proposed method improves an existing estimator by making it less conservative and makes the corresponding adaptive Benjamini-Hochberg algorithm more powerful. The proposed method is applied to a microarray gene expression dataset of colorectal cancer patients, and the results show a gain in the number of differentially expressed genes identified. The R code is available at https://github.com/aniketstat/Proportiontion-of-true-null-under-sparse-dependence-2021.
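As a rough illustration of why a less conservative estimate of the proportion of true nulls makes the adaptive Benjamini-Hochberg procedure more powerful, the sketch below plugs a generic thresholding (Storey-type) estimate into the BH step-up rule. It is not the authors' method: it does not implement the clustering step or the sparse correlation estimate, and the function names, the tuning value `lam = 0.5`, the `1/m` floor, and the toy p-values are all assumptions made for this example.

```python
def storey_pi0(pvalues, lam=0.5):
    """Thresholding estimate of the proportion of true null hypotheses.

    Null p-values are uniform on [0, 1], so roughly pi0 * m * (1 - lam)
    of them are expected to exceed lam.
    """
    m = len(pvalues)
    above = sum(1 for p in pvalues if p > lam)
    estimate = above / (m * (1.0 - lam))
    # Assumption for this sketch: clamp to [1/m, 1] so the adaptive
    # level alpha / pi0 stays finite.
    return max(1.0 / m, min(1.0, estimate))


def bh_reject(pvalues, alpha=0.05, pi0=1.0):
    """Benjamini-Hochberg step-up rule run at level alpha / pi0.

    pi0 = 1.0 gives the ordinary BH procedure; passing an estimate
    below 1 gives the adaptive version. Returns the sorted indices
    of the rejected hypotheses.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose ordered p-value is under its threshold.
        if pvalues[idx] <= rank * alpha / (m * pi0):
            k_max = rank
    return sorted(order[:k_max])


# Toy p-values, invented for illustration only.
pvals = [0.001, 0.01, 0.02, 0.04, 0.045, 0.3, 0.6, 0.9]
pi0_hat = storey_pi0(pvals)
plain = bh_reject(pvals, alpha=0.05)
adaptive = bh_reject(pvals, alpha=0.05, pi0=pi0_hat)
print(pi0_hat, plain, adaptive)
```

On these toy values the estimate is 0.5, so the adaptive rule compares each ordered p-value with a threshold twice as large as the ordinary BH threshold and rejects more hypotheses. An estimate that is biased towards 1 shrinks that advantage, which is the sense in which conservativeness costs power.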

Similar articles

Re-sampling strategy to improve the estimation of number of null hypotheses in FDR control under strong correlation structures.

BMC Bioinformatics. 2007 May 18;8:157. doi: 10.1186/1471-2105-8-157.

Effects of dependence in high-dimensional multiple testing problems.

BMC Bioinformatics. 2008 Feb 25;9:114. doi: 10.1186/1471-2105-9-114.

Multiple testing with discrete data: Proportion of true null hypotheses and two adaptive FDR procedures.

Biom J. 2018 Jul;60(4):761-779. doi: 10.1002/bimj.201700157. Epub 2018 May 11.

SLIM: a sliding linear model for estimating the proportion of true null hypotheses in datasets with dependence structures.

Bioinformatics. 2011 Jan 15;27(2):225-31. doi: 10.1093/bioinformatics/btq650. Epub 2010 Nov 18.

On correcting the overestimation of the permutation-based false discovery rate estimator.

Bioinformatics. 2008 Aug 1;24(15):1655-61. doi: 10.1093/bioinformatics/btn310. Epub 2008 Jun 23.

Controlling false discoveries in multidimensional directional decisions, with applications to gene expression data on ordered categories.

Biometrics. 2010 Jun;66(2):485-92. doi: 10.1111/j.1541-0420.2009.01292.x. Epub 2009 Jul 23.

Bias-corrected estimators for proportion of true null hypotheses: application of adaptive FDR-controlling in segmented failure data.

J Appl Stat. 2021 Jul 27;49(14):3591-3613. doi: 10.1080/02664763.2021.1957790. eCollection 2022.

Estimating the proportion of true null hypotheses and adaptive false discovery rate control in discrete paradigm.

Biom J. 2024 Mar;66(2):e2200204. doi: 10.1002/bimj.202200204.

Construction of null statistics in permutation-based multiple testing for multi-factorial microarray experiments.

Bioinformatics. 2006 Jun 15;22(12):1486-94. doi: 10.1093/bioinformatics/btl109. Epub 2006 Mar 30.
