Saccone Nancy L, Saccone Scott F, Goate Alison M, Grucza Richard A, Hinrichs Anthony L, Rice John P, Bierut Laura J
Department of Genetics, Washington University, Campus Box 8232, 4566 Scott Avenue, Saint Louis, Missouri, USA.
BMC Genet. 2008 Aug 29;9:58. doi: 10.1186/1471-2156-9-58.
Genome-wide association (GWA) using large numbers of single nucleotide polymorphisms (SNPs) is now a powerful, state-of-the-art approach to mapping human disease genes. When a GWA study detects association between a SNP and the disease, this signal usually represents association with a set of several highly correlated SNPs in strong linkage disequilibrium. The challenge we address is to distinguish among these correlated loci to highlight potential functional variants and prioritize them for follow-up.
We implemented a systematic method for testing association across diverse population samples having differing histories and LD patterns, using a logistic regression framework. The hypothesis is that important underlying biological mechanisms are shared across human populations, and we can filter correlated variants by testing for heterogeneity of genetic effects in different population samples. This approach formalizes the descriptive comparison of p-values that has typified similar cross-population fine-mapping studies to date. We applied this method to correlated SNPs in the cholinergic nicotinic receptor gene cluster CHRNA5-CHRNA3-CHRNB4, in a case-control study of cocaine dependence composed of 504 European-American and 583 African-American samples. Of the 10 SNPs genotyped in the r² ≥ 0.8 bin for rs16969968, three demonstrated significant cross-population heterogeneity and are filtered from priority follow-up; the remaining SNPs include rs16969968 (heterogeneity p = 0.75). Though the power to filter out rs16969968 is reduced due to the difference in allele frequency in the two groups, the results nevertheless focus attention on a smaller group of SNPs that includes the non-synonymous SNP rs16969968, which retains a similar effect size (odds ratio) across both population samples.
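The idea of a cross-population heterogeneity test can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the genotype coding or covariates of the logistic model, so the sketch assumes a binary allele-carrier coding with no covariates. Under that assumption, the Wald test of the SNP-by-population interaction term in a saturated logistic model reduces to comparing the two per-population log odds ratios. The function names and all counts below are hypothetical and chosen only for illustration.

```python
import math

def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its large-sample variance for one 2x2 table.

    a: cases carrying the allele      b: cases not carrying it
    c: controls carrying the allele   d: controls not carrying it
    """
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance

def heterogeneity_test(table_pop1, table_pop2):
    """Two-sided Wald test that the log odds ratio is equal in both samples.

    With a binary carrier coding and no covariates (an assumption of this
    sketch), this matches the Wald test of the SNP x population interaction
    coefficient in a saturated logistic regression.
    """
    lor1, var1 = log_odds_ratio(*table_pop1)
    lor2, var2 = log_odds_ratio(*table_pop2)
    z = (lor1 - lor2) / math.sqrt(var1 + var2)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return lor1, lor2, z, p_value

# Hypothetical counts, NOT data from the study.
# SNP A: same odds ratio (2.25) in both samples, so it is retained.
snp_a = heterogeneity_test((60, 40, 40, 60), (30, 20, 20, 30))
# SNP B: opposite effect directions, so it is filtered out.
snp_b = heterogeneity_test((60, 40, 40, 60), (40, 60, 60, 40))

for name, (lor1, lor2, z, p) in (("SNP A", snp_a), ("SNP B", snp_b)):
    decision = "filter out" if p < 0.05 else "retain"
    print(f"{name}: OR1={math.exp(lor1):.2f} OR2={math.exp(lor2):.2f} "
          f"heterogeneity p={p:.3g} -> {decision}")
```

A SNP is retained for follow-up when the heterogeneity p-value is not significant, as reported for rs16969968 (heterogeneity p = 0.75). The 0.05 threshold used here is an illustrative choice; a full analysis would fit the logistic regression directly so that covariates and other genotype codings can be included.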
Filtering out SNPs that demonstrate cross-population heterogeneity enriches for variants more likely to be important and causative. Our approach provides an important and effective tool to help interpret results from the many GWA studies now underway.