

Kerfdr: a semi-parametric kernel-based approach to local false discovery rate estimation.

Authors

Guedj Mickael, Robin Stephane, Celisse Alain, Nuel Gregory

Affiliation

Statistics and Genome laboratory, CNRS UMR8071, INRA U1152, University of Evry, Evry, France.

Publication

BMC Bioinformatics. 2009 Mar 16;10:84. doi: 10.1186/1471-2105-10-84.

Abstract

BACKGROUND

High-throughput genetic, genomic and post-genomic data require the simultaneous evaluation of a large number of statistical hypotheses, which raises the multiple-testing problem. Over the last ten years, the False Discovery Rate (FDR) has emerged as a more appropriate way to handle this problem than the overly conservative Family-Wise Error Rate (FWER). One drawback of the FDR, however, is that it is tied to a given rejection region for the statistics considered: it assigns the same value to statistics close to the boundary of that region and to those far from it. The local FDR has therefore recently been proposed to quantify the probability that a specific null hypothesis is true.
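
The distinction between the two quantities can be written down under the usual two-component mixture model. This is the standard Bayesian formulation, given here as an illustration; the symbols $\pi_0$, $\pi_1$, $f_0$, $f_1$, $F_0$, $F$ and the rejection region $\mathcal{R}$ are notation assumed for this sketch and do not appear in the abstract.

```latex
% Mixture density of a test statistic X: null component f_0, alternative component f_1
f(x) = \pi_0 f_0(x) + \pi_1 f_1(x), \qquad \pi_0 + \pi_1 = 1

% FDR attached to a rejection region R: one value shared by every statistic in R
\mathrm{FDR}(\mathcal{R}) = \Pr(H_0 \mid X \in \mathcal{R}) = \frac{\pi_0 F_0(\mathcal{R})}{F(\mathcal{R})}

% Local FDR: a value specific to each observed statistic x
\mathrm{lfdr}(x) = \Pr(H_0 \mid X = x) = \frac{\pi_0 f_0(x)}{f(x)}
```

Two statistics inside the same region $\mathcal{R}$ share $\mathrm{FDR}(\mathcal{R})$ but generally have different $\mathrm{lfdr}(x)$, which is the point made above about statistics near the boundary.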

RESULTS

In this context, we present a semi-parametric approach based on kernel estimators and apply it to several kinds of high-throughput biological data: patterns in DNA sequences, gene expression, and genome-wide association studies.

CONCLUSION

Compared with existing approaches, the proposed method has three practical advantages: it accommodates complex heterogeneity in the alternative hypothesis, it can take prior information into account (from expert judgment or previous studies) through a semi-supervised mode, and it handles truncated distributions such as those obtained in Monte-Carlo simulations. The method is implemented in the R package kerfdr, available from CRAN or at http://stat.genopole.cnrs.fr/software/kerfdr.
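
To make the idea of a semi-parametric, kernel-based estimate concrete, here is a minimal Python sketch. It is not the kerfdr implementation and does not reproduce the package's API. It assumes that p-values are probit-transformed so that the null density is standard normal, that the alternative density is re-estimated at each iteration by a weighted Gaussian kernel estimator, and that the bandwidth is fixed by hand. The function name `local_fdr` and the parameters `bandwidth`, `n_iter` and `known` are hypothetical; `known` mimics a semi-supervised mode by pinning hypotheses whose status is assumed to be known. Truncated distributions are not handled.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed null density f0 after the probit transform


def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def local_fdr(pvalues, bandwidth=0.3, n_iter=20, known=None):
    """Return an illustrative estimate of P(H0 | p_i) for each p-value.

    known[i] may be 0 (assumed null), 1 (assumed alternative) or None.
    """
    n = len(pvalues)
    known = known if known is not None else [None] * n
    # Probit transform: under H0, x = Phi^{-1}(p) follows N(0, 1).
    eps = 1e-12
    x = [STD_NORMAL.inv_cdf(min(max(p, eps), 1.0 - eps)) for p in pvalues]
    f0 = [STD_NORMAL.pdf(xi) for xi in x]

    def clamp_labels(tau):
        # Semi-supervised step: labeled hypotheses keep their known status.
        return [t if k is None else float(k) for t, k in zip(tau, known)]

    # tau[i]: current posterior probability that hypothesis i is an alternative.
    tau = clamp_labels([1.0 - p for p in pvalues])
    for _ in range(n_iter):
        total = sum(tau)
        if total == 0.0:  # no weight left on the alternative component
            break
        pi1 = total / n  # updated proportion of alternatives
        # Weighted kernel estimate of the alternative density f1 at each x_i.
        f1 = [
            sum(t * gaussian_kernel((xi - xj) / bandwidth) for t, xj in zip(tau, x))
            / (bandwidth * total)
            for xi in x
        ]
        # Posterior update from the two-component mixture.
        tau = clamp_labels(
            [pi1 * a / (pi1 * a + (1.0 - pi1) * b) for a, b in zip(f1, f0)]
        )
    return [1.0 - t for t in tau]


# Simulated example: 160 uniform null p-values followed by 40 small ones.
rng = random.Random(0)
pvals = [rng.random() for _ in range(160)]
pvals += [STD_NORMAL.cdf(rng.gauss(-3.0, 1.0)) for _ in range(40)]
lfdr = local_fdr(pvals)
# Mean estimate for the simulated nulls, then for the simulated alternatives.
print(sum(lfdr[:160]) / 160, sum(lfdr[160:]) / 40)
```

Because the alternative density is estimated without a parametric form, a multi-modal or otherwise heterogeneous alternative needs no change to the code; only the kernel weights adapt. Passing, for example, `known=[0] + [None] * 199` would fix the first hypothesis as a null throughout the iterations.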
