Department of Preventive Medicine, Keck School of Medicine and Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, California, United States of America.
PLoS Genet. 2013 Mar;9(3):e1003419. doi: 10.1371/journal.pgen.1003419. Epub 2013 Mar 28.
Rare variation in protein-coding sequence is poorly captured by GWAS arrays and has been hypothesized to contribute to disease heritability. Using the Illumina HumanExome SNP array, we successfully genotyped 191,032 common and rare non-synonymous, splice site, or nonsense variants in a multiethnic sample of 2,984 breast cancer cases, 4,376 prostate cancer cases, and 7,545 controls. In breast cancer, the strongest associations included either SNPs in or gene burden scores for the genes LDLRAD1, SLC19A1, FGFBP3, CASP5, MMAB, SLC16A6, and INS-IGF2. In prostate cancer, one of the most strongly associated SNPs was in the gene GPRC6A (rs2274911, Pro91Ser, OR = 0.88, P = 1.3 × 10⁻⁵), near a known risk locus for prostate cancer; other suggestive associations were noted in genes such as F13A1, ANXA4, MANSC1, and GP6. For both breast and prostate cancer, several of the most significant associations involving SNPs or gene burden scores (sum of minor alleles) were noted in genes previously reported to be associated with a cancer-related phenotype. However, only one of the associations for overall breast or prostate cancer risk (rs145889899 in LDLRAD1, P = 2.5 × 10⁻⁷, seen only in African Americans) was statistically significant after correcting for multiple comparisons. In addition to breast and prostate cancer, other cancer-related traits were examined (body mass index, PSA level, and alcohol drinking), with a number of known and potentially novel associations described. In general, these findings do not support the existence of many protein-coding variants that confer moderate to high risk of breast or prostate cancer, that is, variants with odds ratios in the range probably required for protein-coding variation to play a truly outstanding role in risk heritability. Very large sample sizes will be required to better define the role of rare and less penetrant coding variation in the genetics of prostate and breast cancer.
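As an illustration of two quantities mentioned in the abstract, the sketch below computes a gene burden score as a sum of minor alleles and checks the reported P values against a multiple-comparison threshold. The abstract does not say which correction was used; a Bonferroni correction at alpha = 0.05 over the 191,032 genotyped variants is assumed here, and the genotype values are hypothetical. The function names are illustrative only.

```python
from typing import Sequence


def burden_score(minor_allele_counts: Sequence[int]) -> int:
    """Sum minor-allele counts (0, 1, or 2 per variant) over one gene's variants."""
    if any(c not in (0, 1, 2) for c in minor_allele_counts):
        raise ValueError("each genotype must be coded 0, 1, or 2")
    return sum(minor_allele_counts)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction (assumed)."""
    return alpha / n_tests


# Hypothetical genotypes for one individual at four variants in one gene.
genotypes = [0, 1, 0, 2]
score = burden_score(genotypes)

# Assumption: alpha = 0.05, one test per genotyped variant (191,032 in the abstract).
threshold = bonferroni_threshold(0.05, 191_032)

# P values reported in the abstract.
ldlrad1_passes = 2.5e-7 < threshold   # rs145889899 in LDLRAD1
gprc6a_passes = 1.3e-5 < threshold    # rs2274911 in GPRC6A

print(score, threshold, ldlrad1_passes, gprc6a_passes)
```

Under these assumptions the threshold is roughly 2.6 × 10⁻⁷, which is consistent with the LDLRAD1 association being the only one reported as significant. Counting gene burden tests as well would lower the threshold further, so this is an approximation rather than the study's actual procedure.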