Bayesian graphical models for genomewide association studies.

Authors

Verzilli Claudio J, Stallard Nigel, Whittaker John C

Affiliation

Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, UK.

Publication information

Am J Hum Genet. 2006 Jul;79(1):100-12. doi: 10.1086/505313. Epub 2006 May 30.

Abstract

As the extent of human genetic variation becomes more fully characterized, the research community is faced with the challenging task of using this information to dissect the heritable components of complex traits. Genomewide association studies offer great promise in this respect, but their analysis poses formidable difficulties. In this article, we describe a computationally efficient approach to mining genotype-phenotype associations that scales to the size of the data sets currently being collected in such studies. We use discrete graphical models as a data-mining tool, searching for single- or multilocus patterns of association around a causative site. The approach is fully Bayesian, allowing us to incorporate prior knowledge on the spatial dependencies around each marker due to linkage disequilibrium, which considerably reduces the number of possible graphical structures. A Markov chain Monte Carlo scheme is developed that yields samples from the posterior distribution of graphs conditional on the data, from which probabilistic statements about the strength of any genotype-phenotype association can be made. Using data simulated under scenarios that vary in marker density, genotype relative risk of a causative allele, and mode of inheritance, we show that the proposed approach has better localization properties and leads to lower false-positive rates than do single-locus analyses. Finally, we present an application of our method to a quasi-synthetic data set in which data from the CYP2D6 region are embedded within simulated data on 100K single-nucleotide polymorphisms. Analysis is quick (<5 min), and we are able to localize the causative site to a very short interval.
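
The idea of sampling graphs from a posterior and reading off association probabilities can be illustrated with a toy sketch. This is not the authors' algorithm: it omits the linkage-disequilibrium prior and any marker-to-marker edges, and it only samples which markers have an edge to a binary phenotype. The function names, the Beta(1, 1) cell prior, and the `prior_edge` sparsity parameter are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def log_beta(a, b):
    """Log of the Beta function, computed with log-gamma for stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(genotypes, phenotype, included, a=1.0, b=1.0):
    """Log marginal likelihood of a 0/1 phenotype given the markers in `included`.

    Assumption: each joint genotype configuration of the included markers has
    its own Beta(a, b) prior on the case probability, so the marginal
    likelihood is a product of beta-binomial terms, one per observed cell.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [controls, cases]
    for row, y in zip(genotypes, phenotype):
        cell = tuple(row[j] for j in included)
        counts[cell][y] += 1
    return sum(log_beta(a + n1, b + n0) - log_beta(a, b)
               for n0, n1 in counts.values())


def sample_inclusion(genotypes, phenotype, n_iter=2000, prior_edge=0.05, seed=0):
    """Metropolis sampler over sets of marker-phenotype edges.

    Each step proposes adding or removing one randomly chosen edge (a
    symmetric proposal). Returns, per marker, the fraction of iterations in
    which its edge was present, an estimate of its posterior inclusion
    probability under this toy model.
    """
    rng = random.Random(seed)
    n_markers = len(genotypes[0])
    log_prior_odds = math.log(prior_edge / (1.0 - prior_edge))
    included = set()
    current = log_marginal(genotypes, phenotype, sorted(included))
    visits = [0] * n_markers
    for _ in range(n_iter):
        j = rng.randrange(n_markers)
        proposal = included ^ {j}           # flip the edge for marker j
        sign = 1 if j in proposal else -1   # +1 when adding, -1 when removing
        candidate = log_marginal(genotypes, phenotype, sorted(proposal))
        log_ratio = candidate - current + sign * log_prior_odds
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            included, current = proposal, candidate
        for k in included:
            visits[k] += 1
    return [v / n_iter for v in visits]
```

Here `genotypes` is a list of rows of 0/1/2 genotype codes and `phenotype` is a list of 0/1 labels; calling `sample_inclusion(genotypes, phenotype)` returns one estimated inclusion probability per marker. A spatial prior of the kind described in the abstract would replace the independent `prior_edge` term, and that part is deliberately left out of this sketch.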
