Simultaneous inferences based on empirical Bayes methods and false discovery rates in eQTL data analysis.

Publication information

BMC Genomics. 2013;14 Suppl 8(Suppl 8):S8. doi: 10.1186/1471-2164-14-S8-S8. Epub 2013 Dec 9.

Abstract

BACKGROUND

Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with complex human diseases, clinical conditions and traits. Genetic mapping of expression quantitative trait loci (eQTLs) is revealing novel functional effects of thousands of single nucleotide polymorphisms (SNPs). In a classical quantitative trait loci (QTL) mapping problem, multiple tests are done to assess whether one trait is associated with a number of loci. In contrast to QTL studies, thousands of traits are measured along with thousands of gene expressions in an eQTL study. Such a study requires a huge number of tests (~10^6). This extreme multiplicity gives rise to many computational and statistical problems. In this paper we address these issues using two closely related inferential approaches: an empirical Bayes method, which retains a Bayesian flavor without requiring much a priori knowledge, and the frequentist method of false discovery rates. A three-component t-mixture model is used for the parametric empirical Bayes (PEB) method. Inferences are obtained using the Expectation/Conditional Maximization Either (ECME) algorithm. A simulation study was also performed, and the PEB method was compared with a nonparametric empirical Bayes (NPEB) alternative.
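To make the three-component t-mixture idea concrete, the following minimal sketch evaluates a local false discovery rate from such a mixture. It is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the parametrization, so the layout (one null component centered at zero plus one negative and one positive non-null component) and every numeric value in `WEIGHTS`, `LOCS`, `SCALES` and `DFS` are hypothetical. In the paper these quantities would be estimated with ECME; that fitting step is not shown here.

```python
import math


def t_pdf(x, loc, scale, df):
    """Density of a location-scale Student t distribution at x."""
    z = (x - loc) / scale
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(z * z / df))


# Hypothetical mixture parameters (assumed for illustration only).
# Component 0 is the null; components 1 and 2 are the non-null groups.
WEIGHTS = (0.90, 0.05, 0.05)
LOCS = (0.0, -4.0, 4.0)
SCALES = (1.0, 1.0, 1.0)
DFS = (10.0, 10.0, 10.0)


def local_fdr(x):
    """lfdr(x) = (null weight * null density) / (mixture density)."""
    parts = [
        w * t_pdf(x, m, s, v)
        for w, m, s, v in zip(WEIGHTS, LOCS, SCALES, DFS)
    ]
    return parts[0] / sum(parts)


if __name__ == "__main__":
    for stat in (0.0, 2.0, 4.0):
        print(stat, round(local_fdr(stat), 4))
```

Under these assumed parameters, a test statistic near zero is attributed almost entirely to the null component, while a statistic near one of the non-null locations receives a small lfdr.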

RESULTS

The results show that PEB has an edge over NPEB. The proposed methodology has been applied to the human liver cohort (LHC) data. Our method discovers more significant SNPs at FDR < 10% than the previous study by Yang et al. (Genome Research, 2010).

CONCLUSIONS

In contrast to previously available methods based on p-values, the empirical Bayes method uses the local false discovery rate (lfdr) as the threshold. This method controls the false positive rate.
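The link between thresholding on lfdr and controlling false positives can be sketched as follows. This is a generic illustration, not necessarily the procedure used in the paper: tests whose lfdr falls at or below a cutoff are declared significant, and the mean lfdr over the rejected set serves as an estimate of that set's false discovery rate, which by construction cannot exceed the cutoff. The function names, the example lfdr values and the cutoff of 0.2 are all hypothetical.

```python
def reject_by_lfdr(lfdr, cutoff):
    """Return indices of tests whose lfdr is at or below the cutoff."""
    return [i for i, value in enumerate(lfdr) if value <= cutoff]


def estimated_fdr(lfdr, rejected):
    """Estimate the FDR of a rejected set as its mean lfdr."""
    if not rejected:
        return 0.0
    return sum(lfdr[i] for i in rejected) / len(rejected)


if __name__ == "__main__":
    # Hypothetical lfdr values for five SNP-expression tests.
    example_lfdr = [0.01, 0.6, 0.05, 0.2, 0.9]
    hits = reject_by_lfdr(example_lfdr, cutoff=0.2)
    print(hits, estimated_fdr(example_lfdr, hits))
```

Because every rejected test has lfdr no larger than the cutoff, the estimated FDR of the rejected set is bounded by the cutoff as well.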

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/92fb/4042241/2e2da50ffc71/1471-2164-14-S8-S8-1.jpg
