

Control procedures and estimators of the false discovery rate and their application in low-dimensional settings: an empirical investigation.

Author information

Institute of Medical Biometry and Informatics, University of Heidelberg, Im Neuenheimer Feld 130.3, 69120, Heidelberg, Germany.

Institute for Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Stefan-Meier-Str. 26, 79104, Freiburg, Germany.

Publication information

BMC Bioinformatics. 2018 Mar 2;19(1):78. doi: 10.1186/s12859-018-2081-x.

Abstract

BACKGROUND

When many (up to millions of) statistical tests are conducted in discovery set analyses such as genome-wide association studies (GWAS), approaches controlling the family-wise error rate (FWER) or the false discovery rate (FDR) are required to reduce the number of false positive decisions. Some methods were developed specifically for high-dimensional settings and partially rely on an estimate of the proportion of true null hypotheses. However, these approaches are also applied in low-dimensional settings such as replication set analyses, which may be restricted to a small number of specific hypotheses. The aim of this study was to compare different approaches in low-dimensional settings using (a) real data from the CKDGen Consortium and (b) a simulation study.
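To make the FWER/FDR distinction concrete, the following minimal Python sketch contrasts one textbook FWER procedure (Bonferroni) with one textbook FDR procedure (Benjamini-Hochberg). The abstract does not list which procedures the study compared, so the choice of these two, the function names, the significance level of 0.05, and the six p-values are illustrative assumptions only.

```python
def bonferroni(pvals, alpha=0.05):
    """FWER control: reject H_i when p_i <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """FDR control (step-up): find the largest rank k with
    p_(k) <= (k / m) * alpha and reject the k smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for a small replication-style analysis (m = 6).
pvals = [0.001, 0.012, 0.02, 0.03, 0.2, 0.6]

n_bonferroni = sum(bonferroni(pvals))
n_bh = sum(benjamini_hochberg(pvals))
print(n_bonferroni, n_bh)
```

On these made-up values the FDR procedure rejects more hypotheses than the FWER procedure, which is the usual trade-off: more rejections in exchange for a weaker error guarantee.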

RESULTS

In both the application and the simulation, FWER approaches were less powerful than FDR control methods, regardless of whether a large or a small number of hypotheses was tested. The q-value method was the most powerful. However, its specificity, that is, its ability to retain true null hypotheses, decreased markedly when the number of tested hypotheses was small. In this low-dimensional situation, the estimate of the proportion of true null hypotheses was biased.
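One way to see why the proportion of true null hypotheses is hard to estimate from few tests is to look at a commonly used fixed-threshold estimator, pi0_hat = #{p_i > lambda} / (m * (1 - lambda)), capped at 1. The abstract does not state which estimator the study used, so this estimator, the threshold lambda = 0.5, and the helper names below are assumptions made for illustration. The sketch only shows how coarse the estimate becomes for small m; it does not reproduce the study's bias results.

```python
def storey_pi0(pvals, lam=0.5):
    """Fixed-lambda estimate of the proportion of true nulls:
    count of p-values above lam, divided by m * (1 - lam), capped at 1."""
    m = len(pvals)
    n_above = sum(1 for p in pvals if p > lam)
    return min(1.0, n_above / (m * (1.0 - lam)))


def attainable_values(m, lam=0.5):
    """All values the estimator can take when m hypotheses are tested."""
    return sorted({min(1.0, k / (m * (1.0 - lam))) for k in range(m + 1)})


# Hypothetical p-values from four tests.
small_study = [0.01, 0.02, 0.03, 0.7]
estimate = storey_pi0(small_study)

# With m = 4 the estimate is restricted to a handful of values,
# while m = 1000 allows a much finer grid.
grid_small = attainable_values(4)
grid_large = attainable_values(1000)
print(estimate, grid_small, len(grid_large))
```

With four tests the estimate can only land on a few distinct values, so a single p-value moving across the threshold changes it substantially; with a thousand tests the grid is fine enough for the estimate to vary smoothly.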

CONCLUSIONS

The results highlight the importance of a sizeable data set for reliable estimation of the proportion of true null hypotheses. Consequently, methods relying on this estimate should only be applied in high-dimensional settings. Furthermore, if the focus lies on testing a small number of hypotheses, as in replication settings, FWER methods should be preferred over FDR methods to maintain high specificity.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/608c/5833079/82d2aa861a1e/12859_2018_2081_Fig1_HTML.jpg
