Institute of Statistical Science, Academia Sinica, Taipei, Taiwan; Department of Epidemiology; Department of Biostatistics, Brown University, Providence, Rhode Island; Department of Public Health and Community Medicine, Tufts University, Boston, Massachusetts.
Neuro Oncol. 2017 Jul 1;19(7):940-950. doi: 10.1093/neuonc/now288.
Glioma accounts for 80% of malignant brain tumors, but its etiologic determinants remain elusive. Although genome-wide association studies (GWAS) have identified genetic susceptibility loci, their agnostic approach leaves open the possibility that other susceptibility genes remain to be discovered. Here we conduct a gene-centric integrative GWAS (iGWAS) of glioma risk that combines transcriptomics and genetics.
We synthesized a brain transcriptomics dataset (n = 354), a GWAS dataset (n = 4203), and an advanced glioma tumor transcriptomic dataset (n = 483) to conduct an iGWAS. Using the expression quantitative trait loci (eQTL) dataset, we built models that predict gene expression in the GWAS samples from their eQTL genotypes. With the predicted gene expression, iGWAS analyses were performed using a novel statistical method. A gene signature risk score was constructed using a penalized logistic regression model.
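The expression-prediction step described above can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the prediction method, so ridge regression is swapped in here as an assumption, and all data, effect sizes, and names (`fit_ridge`, `predict_expression`, `G_ref`, `G_gwas`) are hypothetical and simulated. Only the sample sizes are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ridge(G, y, lam=1.0):
    """Closed-form ridge regression of expression y on genotype matrix G.

    Assumption: ridge is a stand-in; the source does not name the model.
    """
    g_mean = G.mean(axis=0)
    y_mean = y.mean()
    Gc = G - g_mean  # center genotypes so the intercept is unpenalized
    beta = np.linalg.solve(Gc.T @ Gc + lam * np.eye(G.shape[1]),
                           Gc.T @ (y - y_mean))
    intercept = y_mean - g_mean @ beta
    return intercept, beta

def predict_expression(G, intercept, beta):
    """Impute expression for samples that only have genotypes."""
    return intercept + G @ beta

# Simulated reference panel: genotypes (0/1/2 allele counts) plus
# measured expression of one gene. Sizes follow the abstract.
n_ref, n_gwas, n_snps = 354, 4203, 20
maf = rng.uniform(0.1, 0.5, n_snps)      # hypothetical allele frequencies
G_ref = rng.binomial(2, maf, size=(n_ref, n_snps)).astype(float)
true_w = np.zeros(n_snps)
true_w[:3] = [0.8, -0.5, 0.4]            # hypothetical eQTL effects
expr_ref = G_ref @ true_w + rng.normal(0.0, 1.0, n_ref)

# Train on the eQTL panel, then predict expression in the GWAS cohort.
intercept, beta = fit_ridge(G_ref, expr_ref, lam=5.0)
G_gwas = rng.binomial(2, maf, size=(n_gwas, n_snps)).astype(float)
pred_expr = predict_expression(G_gwas, intercept, beta)
```

The predicted values in `pred_expr` would then serve as the gene-level exposure tested against case or control status in the GWAS samples.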
A total of 30527 transcripts were analyzed using the iGWAS approach. Four novel glioma susceptibility genes were identified with internal and external validation, including DRD5 (P = 3.0 × 10⁻⁷⁹), WDR1 (P = 8.4 × 10⁻⁷⁷), NOMO1 (P = 1.3 × 10⁻²⁵), and PDXDC1 (P = 8.3 × 10⁻²⁴). The genotype-predicted transcription pattern between cases and controls is consistent with that between tumor and its matched normal tissue. Adding the genotype-based 4-gene signature to a model based on age, gender, and population stratification improved the classification of glioma cases versus controls, with the area under the receiver operating characteristic curve increasing from 0.77 to 0.85 (P = 8.1 × 10⁻²³).
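The comparison of classification with and without a gene signature can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis: the data are simulated, the penalty is assumed to be L2 (the abstract says only "penalized"), the covariates are reduced to age and sex, and the helper names `fit_l2_logistic` and `auc` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_l2_logistic(X, y, lam=0.01, lr=0.5, n_iter=2000):
    """L2-penalized logistic regression fitted by gradient descent."""
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        resid = sigmoid(X @ w + b) - y
        w -= lr * (X.T @ resid / n + lam * w)  # penalty on weights only
        b -= lr * resid.mean()
    return w, b

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Assumes continuous scores, so ties are not handled.
    """
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    return (ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Simulated cohort: standardized age, binary sex, and four
# genotype-predicted expression values (all hypothetical).
n = 2000
age = rng.normal(size=n)
sex = rng.integers(0, 2, size=n).astype(float)
genes = rng.normal(size=(n, 4))
logit = 0.8 * age + 0.3 * sex + genes @ np.array([1.0, -0.8, 0.6, 0.5]) - 0.15
y = (rng.uniform(size=n) < sigmoid(logit)).astype(float)

X_base = np.column_stack([age, sex])
X_full = np.column_stack([age, sex, genes])
train, test = slice(0, 1500), slice(1500, n)

w0, b0 = fit_l2_logistic(X_base[train], y[train])
w1, b1 = fit_l2_logistic(X_full[train], y[train])

# Evaluate both models on held-out samples.
auc_base = auc(X_base[test] @ w0 + b0, y[test])
auc_full = auc(X_full[test] @ w1 + b1, y[test])
print(f"AUC covariates only: {auc_base:.3f}; with gene signature: {auc_full:.3f}")
```

Evaluating on a held-out split, as here, avoids crediting the larger model for overfitting; the abstract does not say how the reported AUC values were validated beyond "internal and external validation".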
A new genotype-based gene signature of glioma was identified using a novel iGWAS approach, which integrates multiplatform genomic data as well as different genetic association studies.