Salomon Matthew P, Li Wai Lok Sibon, Edlund Christopher K, Morrison John, Fortini Barbara K, Win Aung Ko, Conti David V, Thomas Duncan C, Duggan David, Buchanan Daniel D, Jenkins Mark A, Hopper John L, Gallinger Steven, Le Marchand Loïc, Newcomb Polly A, Casey Graham, Marjoram Paul
Department of Preventive Medicine, Keck School of Medicine of the University of Southern California, Los Angeles, CA, USA.
Department of Molecular Oncology, John Wayne Cancer Institute at Providence Saint John's Health Center, Santa Monica, CA, USA.
BMC Genomics. 2016 Mar 3;17:176. doi: 10.1186/s12864-016-2459-y.
For the last decade, the conceptual framework of the Genome-Wide Association Study (GWAS) has dominated the investigation of human disease and other complex traits. While GWAS have identified a large number of variants associated with various phenotypes, the overall amount of heritability explained by these variants remains small. This raises the question of how best to follow up on a GWAS, localize the causal variants accounting for GWAS hits, and as a consequence explain more of the so-called "missing" heritability. Advances in high-throughput sequencing technologies now allow for the efficient and cost-effective collection of vast amounts of fine-scale genomic data to complement GWAS.
We investigate these issues using a colon cancer dataset. After quality control (QC), our data consisted of 1993 cases and 899 controls. Using marginal tests of association, we identify 10 variants distributed among six targeted regions that are significantly associated with colorectal cancer, with eight of the variants being novel to this study. Additionally, we perform so-called 'SNP-set' tests of association and identify two sets of variants that implicate both common and rare variants in the etiology of colorectal cancer.
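As an illustration of what a marginal (single-variant) test of association can look like, the sketch below runs a Pearson chi-square test with one degree of freedom on a 2x2 table of allele counts in cases and controls. This is a generic textbook test, not necessarily the test used in the study; the function name `allelic_chi2_test` and the allele counts in the usage example are hypothetical and chosen only to match the sample sizes reported above (1993 cases and 899 controls, i.e. 3986 and 1798 alleles).

```python
import math


def allelic_chi2_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases/controls, columns are alternate/reference alleles.
    Returns (chi2_statistic, p_value). Illustrative only: no covariate
    adjustment and no continuity correction.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_totals = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    n = sum(row_totals)

    # A monomorphic variant or an empty group carries no information.
    if 0 in row_totals or 0 in col_totals:
        return 0.0, 1.0

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts for one variant (not data from the study).
    chi2, p = allelic_chi2_test(case_alt=1200, case_ref=2786,
                                ctrl_alt=450, ctrl_ref=1348)
    print(f"chi2 = {chi2:.3f}, p = {p:.3g}")
```

In practice such a test would be repeated for every variant in the targeted regions, with a correction for multiple testing. SNP-set tests instead aggregate evidence across a group of variants, which is what allows rare variants to contribute.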
Here we present a large-scale targeted re-sequencing resource focusing on genomic regions implicated in colorectal cancer susceptibility previously identified in several GWAS, which aims to 1) provide fine-scale targeted sequencing data for fine-mapping and 2) provide data resources to address methodological questions regarding the design of sequencing-based follow-up studies to GWAS. Additionally, we show that this strategy successfully identifies novel variants associated with colorectal cancer susceptibility and can implicate both common and rare variants.