A novel variational Bayes multiple locus Z-statistic for genome-wide association studies with Bayesian model averaging.

Author information

Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 98109, Seattle, WA 98195, USA.

Publication information

Bioinformatics. 2012 Jul 1;28(13):1738-44. doi: 10.1093/bioinformatics/bts261. Epub 2012 May 4.

Abstract

MOTIVATION

For many complex traits, including height, the majority of variants identified by genome-wide association studies (GWAS) have small effects, leaving a significant proportion of the heritable variation unexplained. Although many penalized multiple regression methodologies have been proposed to increase the power to detect associations for complex genetic architectures, they generally lack mechanisms for false-positive control and diagnostics for model over-fitting. Our methodology is the first penalized multiple regression approach that explicitly controls Type I error rates and provides model over-fitting diagnostics through a novel normally distributed statistic defined for every marker within the GWAS, based on results from a variational Bayes spike regression algorithm.

RESULTS

We compare the performance of our method to the lasso and single marker analysis on simulated data and demonstrate that our approach has superior performance in terms of power and Type I error control. In addition, using the Women's Health Initiative (WHI) SNP Health Association Resource (SHARe) GWAS of African-Americans, we show that our method has power to detect additional novel associations with body height. These findings replicate by reaching a stringent cutoff of marginal association in a larger cohort.
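
For readers unfamiliar with the single marker analysis used as a baseline above, the following is a minimal sketch of that baseline only, on simulated genotypes. It is not the authors' vBsr algorithm and does not reproduce their simulation design. The function names `single_marker_z` and `bonferroni_z_cutoff`, the sample sizes, allele frequencies, and effect sizes are all hypothetical choices made for illustration.

```python
import numpy as np
from statistics import NormalDist


def single_marker_z(X, y):
    """Per-marker statistic from regressing y on each column of X separately.

    Returns beta_hat / se(beta_hat) for every marker (a t-statistic that is
    approximately standard normal for large n under the null).
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)          # center genotypes
    yc = y - y.mean()                # center phenotype
    sxx = (Xc ** 2).sum(axis=0)      # per-marker sum of squares
    beta = Xc.T @ yc / sxx           # marginal least-squares slopes
    resid_ss = (yc ** 2).sum() - beta ** 2 * sxx
    se = np.sqrt(resid_ss / (n - 2) / sxx)
    return beta / se


def bonferroni_z_cutoff(m, alpha=0.05):
    """Two-sided normal cutoff controlling family-wise error over m tests."""
    return NormalDist().inv_cdf(1 - alpha / (2 * m))


# Hypothetical simulation: 500 individuals, 200 markers, 3 causal markers.
rng = np.random.default_rng(0)
n, m = 500, 200
maf = rng.uniform(0.1, 0.5, size=m)                  # assumed allele frequencies
X = rng.binomial(2, maf, size=(n, m)).astype(float)  # genotypes coded 0/1/2
beta_true = np.zeros(m)
beta_true[:3] = 0.5                                  # assumed effect sizes
y = X @ beta_true + rng.normal(size=n)

z = single_marker_z(X, y)
cutoff = bonferroni_z_cutoff(m)
hits = np.flatnonzero(np.abs(z) > cutoff)
print("Bonferroni cutoff:", round(cutoff, 2))
print("Markers passing the cutoff:", hits)
```

Each marker is tested in isolation here, which is the property that multiple regression approaches such as the one described in this abstract aim to improve on.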

AVAILABILITY

An R-package, including an implementation of our variational Bayes spike regression (vBsr) algorithm, is available at http://kooperberg.fhcrc.org/soft.html.

Cited by

10
Sparse expression bases in cancer reveal tumor drivers.
Nucleic Acids Res. 2015 Feb 18;43(3):1332-44. doi: 10.1093/nar/gku1290. Epub 2015 Jan 12.
