Lindström Sara, Ablorh Akweley, Chapman Brad, Gusev Alexander, Chen Gary, Turman Constance, Eliassen A Heather, Price Alkes L, Henderson Brian E, Le Marchand Loic, Hofmann Oliver, Haiman Christopher A, Kraft Peter
Department of Epidemiology, University of Washington, 1959 N.E. Pacific Street, Health Sciences Building, Room F247B, Seattle, WA, 98195, USA.
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA.
Breast Cancer Res. 2016 Nov 5;18(1):109. doi: 10.1186/s13058-016-0772-7.
Although genome-wide association studies (GWASs) have identified thousands of disease susceptibility regions, the causal mechanisms underlying these regions are not fully known. The GWAS signal in a given region likely originates from one or more as-yet-unidentified causal variants.
Using next-generation sequencing, we characterized 12 breast cancer susceptibility regions identified by GWASs in 2288 breast cancer cases and 2323 controls across four populations of African American, European, Japanese, and Hispanic ancestry.
After genotype calling and quality control, we identified 137,530 single-nucleotide variants (SNVs); of those, 87.2% had a minor allele frequency (MAF) <0.005. For SNVs with MAF >0.005, we calculated the smallest set of SNVs, the posterior probability set (PPS), that has a 90% probability of including the causal SNV. We found that the PPS for two regions, 2q35 and 11q13, contained fewer than 5% of the original SNVs, dramatically decreasing the number of potentially causal SNVs. However, we did not find strong evidence supporting a causal role for any individual SNV. In addition, there were no significant gene-based rare SNV associations after correcting for multiple testing.
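The 90% posterior probability set described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes exactly one causal SNV per region, equal prior probability for every SNV, and that a natural-log Bayes factor is already available for each SNV. The function name `posterior_probability_set`, the SNV labels, and the numeric values are all hypothetical.

```python
import math

def posterior_probability_set(log_bfs, coverage=0.90):
    """Return (snvs, mass): the smallest set of SNVs whose summed
    posterior probability reaches `coverage`, and that summed mass.

    Assumptions (not from the abstract): one causal SNV per region,
    equal priors, and `log_bfs` maps SNV id -> natural-log Bayes factor.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_bfs.values())
    weights = {snv: math.exp(lbf - top) for snv, lbf in log_bfs.items()}
    total = sum(weights.values())

    # Under equal priors, posterior is proportional to the Bayes factor.
    posteriors = {snv: w / total for snv, w in weights.items()}

    # Add SNVs in order of decreasing posterior until coverage is reached.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, mass = [], 0.0
    for snv, p in ranked:
        chosen.append(snv)
        mass += p
        if mass >= coverage:
            break
    return chosen, mass

# Hypothetical region with four SNVs (made-up log Bayes factors).
example = {"snv_a": 6.0, "snv_b": 5.0, "snv_c": 1.0, "snv_d": 0.0}
snvs, mass = posterior_probability_set(example)
print(snvs, round(mass, 3))
```

In this toy region, two of four SNVs carry nearly all of the posterior mass, so the set shrinks to half of the candidates. When many SNVs are strongly correlated, their Bayes factors are similar and the set stays large, which is why large samples are needed to separate them.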
This study illustrates some of the challenges faced in fine-mapping studies in the post-GWAS era, most importantly the large sample sizes needed to identify rare-variant associations or to distinguish the effects of strongly correlated common SNVs.