Head S Taylor, Dezem Felipe, Todor Andrei, Yang Jingjing, Plummer Jasmine, Gayther Simon, Kar Siddhartha, Schildkraut Joellen, Epstein Michael P
Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA.
Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA.
bioRxiv. 2023 Nov 13:2023.11.09.566218. doi: 10.1101/2023.11.09.566218.
Transcriptome-wide association studies (TWAS) have investigated the role of genetically regulated transcriptional activity in the etiologies of breast and ovarian cancer. However, methods applied to date have only considered regulatory effects of risk-associated SNPs thought to act in cis on a nearby target gene. With growing evidence for distal (trans) regulatory effects of variants on gene expression, we performed TWAS of breast and ovarian cancer using a Bayesian genome-wide TWAS method (BGW-TWAS) that considers effects of both cis- and trans-expression quantitative trait loci (eQTLs). We applied BGW-TWAS to whole genome and RNA sequencing data in breast and ovarian tissues from the Genotype-Tissue Expression project to train expression imputation models. We applied these models to large-scale GWAS summary statistic data from the Breast Cancer and Ovarian Cancer Association Consortia to identify genes associated with risk of overall breast cancer, non-mucinous epithelial ovarian cancer, and 10 cancer subtypes. We identified 101 genes significantly associated with risk of breast cancer phenotypes and 8 with risk of ovarian cancer phenotypes. These loci include established risk genes and several novel candidate risk loci, such as , whose associations are predominantly driven by trans-eQTLs. We replicated several associations using summary statistics from an independent GWAS of these cancer phenotypes. We further used genotype and expression data in normal and tumor breast tissue from the Cancer Genome Atlas to examine the performance of our trained expression imputation models. This work represents a first look into the role of eQTLs in the complex molecular mechanisms underlying these diseases.