Hoffman Joshua D, Graff Rebecca E, Emami Nima C, Tai Caroline G, Passarelli Michael N, Hu Donglei, Huntsman Scott, Hadley Dexter, Leong Lancelote, Majumdar Arunabha, Zaitlen Noah, Ziv Elad, Witte John S
Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, United States of America.
Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, CA, United States of America.
PLoS Genet. 2017 Mar 31;13(3):e1006690. doi: 10.1371/journal.pgen.1006690. eCollection 2017 Mar.
Breast cancer is the most common solid organ malignancy and the most frequent cause of cancer death among women worldwide. Previous research has yielded insights into its genetic etiology, but there remains a gap in the understanding of genetic factors that contribute to risk, and particularly in the biological mechanisms by which genetic variation modulates risk. The National Cancer Institute's "Up for a Challenge" (U4C) competition provided an opportunity to further elucidate the genetic basis of the disease. Our group leveraged the seven datasets made available by the U4C organizers and data from the publicly available UK Biobank cohort to examine associations between imputed gene expression and breast cancer risk. In particular, we used reference datasets describing the breast tissue and whole blood transcriptomes to impute expression levels in breast cancer cases and controls. In trans-ethnic meta-analyses of U4C and UK Biobank data, we found significant associations between breast cancer risk and the expression of RCCD1 (joint p-value: 3.6 × 10⁻⁶) and DHODH (p-value: 7.1 × 10⁻⁶) in breast tissue, as well as a suggestive association for ANKLE1 (p-value: 9.3 × 10⁻⁵). Expression of RCCD1 in whole blood was also suggestively associated with disease risk (p-value: 1.2 × 10⁻⁵), as were expression of ACAP1 (p-value: 1.9 × 10⁻⁵) and LRRC25 (p-value: 5.2 × 10⁻⁵). While genome-wide association studies (GWAS) have implicated RCCD1 and ANKLE1 in breast cancer risk, they have not identified the remaining three genes. Among the genetic variants that contributed to the predicted expression of the five genes, we found 23 nominally (p-value < 0.05) associated with breast cancer risk, among which 15 are not in high linkage disequilibrium with risk variants previously identified by GWAS. In summary, we used a transcriptome-based approach to investigate the genetic underpinnings of breast carcinogenesis. This approach provided an avenue for deciphering the functional relevance of genes and genetic variants involved in breast cancer.
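The two computational steps named in the abstract, imputing expression from genotypes and combining per-study results in a meta-analysis, can be illustrated with a minimal sketch. The abstract does not specify the software or the statistical model, so everything here is an assumption made for illustration: expression is imputed as a weighted sum of allele dosages, and studies are pooled with a fixed-effect inverse-variance model. The function names, weights, and numbers are hypothetical and are not taken from the study.

```python
import math


def impute_expression(dosages, weights):
    """Predict expression per individual as a weighted sum of allele dosages.

    dosages: list of rows, one per individual, each holding 0-2 dosages per variant.
    weights: per-variant weights (hypothetical; in practice these would be
             learned from a reference transcriptome dataset).
    """
    return [sum(d * w for d, w in zip(row, weights)) for row in dosages]


def fixed_effect_meta(betas, ses):
    """Pool per-study effect estimates with inverse-variance weights.

    Returns the pooled effect, its standard error, and a two-sided p-value
    from a normal approximation. This is an assumed model, not necessarily
    the one used in the study.
    """
    inv_var = [1.0 / se ** 2 for se in ses]
    pooled_beta = sum(w * b for w, b in zip(inv_var, betas)) / sum(inv_var)
    pooled_se = math.sqrt(1.0 / sum(inv_var))
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return pooled_beta, pooled_se, p_value


if __name__ == "__main__":
    # Two individuals, three variants (made-up dosages and weights).
    predicted = impute_expression([[0, 1, 2], [2, 0, 1]], [0.5, -0.2, 0.1])
    print(predicted)

    # Two studies reporting the same gene-level association (made-up values).
    print(fixed_effect_meta([0.10, 0.10], [0.10, 0.10]))
```

In a real analysis the predicted expression would be regressed against case-control status within each study, and the resulting per-study effect estimates and standard errors would feed the pooling step.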