Kerfdr: a semi-parametric kernel-based approach to local false discovery rate estimation.

Authors

Guedj Mickael, Robin Stephane, Celisse Alain, Nuel Gregory

Affiliation

Statistics and Genome laboratory, CNRS UMR8071, INRA U1152, University of Evry, Evry, France.

Publication

BMC Bioinformatics. 2009 Mar 16;10:84. doi: 10.1186/1471-2105-10-84.

DOI: 10.1186/1471-2105-10-84

PMID: 19291295

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2679733/

Abstract

BACKGROUND

High-throughput genetic, genomic and post-genomic data lead to the simultaneous evaluation of a large number of statistical hypotheses and hence to the multiple-testing problem. Over the last ten years, the False Discovery Rate (FDR) has emerged as a more appropriate way to handle this problem than the overly conservative Family-Wise Error Rate (FWER). One drawback of the FDR, however, is that it is tied to a given rejection region for the statistics considered: it attributes the same value to statistics close to the boundary of that region and to those far from it. The local FDR was therefore recently proposed to quantify the specific probability that a given null hypothesis is true.
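
The distinction between the two quantities can be written out under the usual two-component mixture model. The notation below ($\pi_0$ for the proportion of true nulls, $f_0$ and $f_1$ for the null and alternative densities, $\mathcal{R}$ for the rejection region) is an assumption made for illustration and does not come from the abstract; the FDR is shown in its Bayesian (posterior probability) form.

```latex
f(x) = \pi_0 f_0(x) + (1 - \pi_0) f_1(x), \qquad
\mathrm{FDR}(\mathcal{R}) = \frac{\pi_0 \int_{\mathcal{R}} f_0(x)\,dx}{\int_{\mathcal{R}} f(x)\,dx}, \qquad
\ell\mathrm{FDR}(x) = \frac{\pi_0 f_0(x)}{f(x)}
```

The FDR is an average over the whole region $\mathcal{R}$, so every statistic inside it receives the same value, whereas the local FDR is evaluated at the observed statistic $x$ itself.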

RESULTS

In this context we present a semi-parametric approach based on kernel estimators, which we apply to different kinds of high-throughput biological data: patterns in DNA sequences, gene expression and genome-wide association studies.
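
To make the idea of a kernel-based, semi-parametric local FDR estimate concrete, here is a minimal Python sketch. It is not the kerfdr package's code or API. The function name `local_fdr_kernel`, the probit transform of the p-values, the fixed and known null proportion `pi0` (assumed to lie strictly between 0 and 1), the Gaussian kernel and the rule-of-thumb bandwidth are all assumptions chosen for illustration. The null density is treated as known (parametric part), while the alternative density is estimated by a weighted kernel estimator (non-parametric part), with weights updated iteratively from the current posterior probabilities.

```python
import numpy as np
from scipy.stats import norm


def local_fdr_kernel(pvalues, pi0, n_iter=50, bandwidth=None):
    """Hypothetical sketch: local FDR from p-values via a weighted kernel estimate of f1.

    pi0 is the assumed (known) proportion of true nulls, with 0 < pi0 < 1.
    """
    # Clip so the probit transform stays finite for p-values equal to 0 or 1.
    p = np.clip(np.asarray(pvalues, dtype=float), 1e-12, 1 - 1e-12)
    x = norm.ppf(p)  # under the null, x follows a standard normal distribution
    n = x.size
    if bandwidth is None:
        # Rule-of-thumb bandwidth (an assumption, not a tuned choice).
        bandwidth = 1.06 * x.std() * n ** (-0.2)

    f0 = norm.pdf(x)  # known null density evaluated at each observation
    # Pairwise Gaussian kernel matrix: K[i, j] = phi((x_i - x_j) / h) / h.
    K = norm.pdf((x[:, None] - x[None, :]) / bandwidth) / bandwidth

    # tau[i] is the current posterior probability that observation i is non-null.
    tau = np.full(n, 1.0 - pi0)
    for _ in range(n_iter):
        weights = tau / tau.sum()
        f1 = K @ weights  # weighted kernel estimate of the alternative density
        tau = (1.0 - pi0) * f1 / (pi0 * f0 + (1.0 - pi0) * f1)

    return 1.0 - tau  # local FDR: posterior probability of the null
```

A call such as `local_fdr_kernel(pvalues, pi0=0.9)` returns one local FDR value per p-value. The pairwise kernel matrix needs memory quadratic in the number of tests, so this sketch only suits a few thousand p-values; a binned or FFT-based kernel estimate would be the natural replacement for larger problems.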

CONCLUSION

Compared with existing approaches, the proposed method has three practical advantages: it accommodates complex heterogeneities in the alternative hypothesis; it takes prior information (from expert judgment or previous studies) into account through a semi-supervised mode; and it handles truncated distributions such as those obtained in Monte-Carlo simulations. The method is implemented in the R package kerfdr, available from CRAN or at http://stat.genopole.cnrs.fr/software/kerfdr.
