The Bayesian lasso for genome-wide association studies.

Author information

Department of Statistics, Pennsylvania State University, State College, PA 16802, USA.

Publication information

Bioinformatics. 2011 Feb 15;27(4):516-23. doi: 10.1093/bioinformatics/btq688. Epub 2010 Dec 14.

Abstract

MOTIVATION

Despite their success in identifying genes that affect complex diseases or traits, current genome-wide association studies (GWASs) based on single-SNP analysis are too simple to give a comprehensive picture of the genetic architecture of phenotypes. A simultaneous analysis of a large number of SNPs, although statistically challenging, especially with a small number of samples, is crucial for genetic modeling.

METHOD

We propose a two-stage procedure for multi-SNP modeling and analysis in GWASs: we first produce a 'preconditioned' response variable using supervised principal component analysis, and then formulate a Bayesian lasso to select a subset of significant SNPs. The Bayesian lasso is implemented as a hierarchical model in which scale mixtures of normals serve as prior distributions for the genetic effects and exponential priors are placed on their variances; the model is fitted with a Markov chain Monte Carlo (MCMC) algorithm. Our approach obviates the choice of the lasso parameter by imposing a diffuse hyperprior on it and estimating it along with the other parameters. It is particularly powerful for selecting the most relevant SNPs in GWASs, where the number of predictors exceeds the number of observations.
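The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' released code, and several choices are assumptions made for illustration: the function names `precondition_response` and `bayesian_lasso_gibbs`, the parameters `n_keep` and `n_components`, screening SNPs by absolute marginal correlation, and the Gamma hyperprior on the squared lasso parameter (hypothetical hyperparameters `r` and `delta`) standing in for the "diffuse hyperprior". The conditional distributions follow the widely used Park and Casella form of the Bayesian lasso, which may differ in detail from the hierarchy in the paper.

```python
import numpy as np


def precondition_response(X, y, n_keep=20, n_components=3):
    """Stage 1 (sketch): supervised PCA 'preconditioning' of the response."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Absolute marginal correlation of each SNP with the trait (assumed score).
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    scores = np.abs(Xc.T @ yc) / denom
    keep = np.argsort(scores)[-n_keep:]  # indices of the top-scoring SNPs
    # Principal components of the reduced genotype matrix.
    U, _, _ = np.linalg.svd(Xc[:, keep], full_matrices=False)
    Z = U[:, :n_components]  # orthonormal PC scores
    # Fitted values from regressing the centered trait on the leading PCs.
    return Z @ (Z.T @ yc)


def bayesian_lasso_gibbs(X, y, n_iter=500, burn_in=100, r=1.0, delta=0.1, seed=0):
    """Stage 2 (sketch): Gibbs sampler for a Bayesian lasso regression."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    XtX, Xty = Xc.T @ Xc, Xc.T @ yc
    beta, sigma2, tau2, lam2 = np.zeros(p), 1.0, np.ones(p), 1.0
    beta_draws, lam_draws = [], []
    for it in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma2 * A^{-1}), with A = X'X + diag(1/tau2)
        A = XtX + np.diag(1.0 / tau2)
        L = np.linalg.cholesky(A)
        mean = np.linalg.solve(A, Xty)
        beta = mean + np.sqrt(sigma2) * np.linalg.solve(L.T, rng.standard_normal(p))
        # sigma2 | rest ~ Inverse-Gamma(shape, rate), drawn as rate / Gamma(shape, 1)
        resid = yc - Xc @ beta
        shape = (n - 1 + p) / 2.0
        rate = (resid @ resid + beta @ (beta / tau2)) / 2.0
        sigma2 = rate / rng.gamma(shape)
        # 1/tau2_j | rest ~ Inverse-Gaussian(mean=mu_j, shape=lam2)
        mu = np.sqrt(lam2 * sigma2 / (beta**2 + 1e-12))
        tau2 = 1.0 / np.maximum(rng.wald(mu, lam2), 1e-8)
        # lam2 | rest ~ Gamma(p + r, rate = sum(tau2)/2 + delta): the lasso
        # parameter is sampled rather than tuned by hand.
        lam2 = rng.gamma(p + r, 1.0 / (tau2.sum() / 2.0 + delta))
        if it >= burn_in:
            beta_draws.append(beta)
            lam_draws.append(lam2)
    return np.array(beta_draws), np.array(lam_draws)
```

Given a genotype matrix `X` (samples by SNPs) and a trait vector `y`, one would call `y_pre = precondition_response(X, y)` and then `bayesian_lasso_gibbs(X, y_pre)`, and summarize each SNP by the posterior mean or a credible interval of its coefficient. The dense `p x p` solve is practical only for a modest number of SNPs; genome-scale data would need a more efficient update.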

RESULTS

The new approach was examined through a simulation study. By applying it to a real dataset from the Framingham Heart Study, we detected several significant genes that are associated with body mass index (BMI). Our findings support previous results on BMI-related SNPs and provide new insights into the genetic control of this trait.

AVAILABILITY

The computer code for the approach is available on the Penn State Center for Statistical Genetics website, http://statgen.psu.edu.


