Jabs Verena, Edlund Karolina, König Helena, Grinberg Marianna, Madjar Katrin, Rahnenführer Jörg, Ekman Simon, Bergkvist Michael, Holmberg Lars, Ickstadt Katja, Botling Johan, Hengstler Jan G, Micke Patrick
Faculty of Statistics, TU Dortmund University, Dortmund, Germany.
Leibniz Research Centre for Working Environment and Human Factors (IfADo) at Dortmund University, Dortmund, Germany.
PLoS One. 2017 Nov 7;12(11):e0187246. doi: 10.1371/journal.pone.0187246. eCollection 2017.
Non-small cell lung cancer (NSCLC) is a genomically unstable cancer type with extensive copy number aberrations. The relationship between gene copy number alterations and the resulting mRNA levels has been described only fragmentarily. The aim of this study was to conduct a genome-wide analysis of gene copy number gains and corresponding gene expression levels, and of their association with survival, in a clinically well-annotated NSCLC patient cohort (n = 190). While more than half of all analyzed gene copy number-gene expression pairs showed statistically significant correlations (10,296 of 18,756 genes), high correlations, with a correlation coefficient >0.7, were obtained only in a subset of 301 genes (1.6%), including KRAS, EGFR and MDM2. Higher correlation coefficients were associated with higher copy number and expression levels. Strong correlations were frequently based on a few tumors with high copy number gains and correspondingly increased mRNA expression. Among the highly correlating genes, GO groups associated with posttranslational protein modifications, including ubiquitination and neddylation, were particularly frequent. In a meta-analysis including 1,779 patients, we found that survival-associated genes were overrepresented among the highly correlating genes (61 of the 301 highly correlating genes, FDR-adjusted p<0.05). Among them are the chaperone CCT2, the core complex protein NUP107 and the ubiquitination- and neddylation-associated protein CAND1. In conclusion, this comprehensive analysis describes a distinct set of highly correlating genes. These genes were found to be overrepresented among survival-associated genes based on gene expression in a large collection of publicly available datasets.
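The per-gene screen described above can be illustrated with a minimal sketch. It assumes a Pearson correlation coefficient, which the abstract does not specify, and it uses made-up toy values for two hypothetical genes (`GENE_A`, `GENE_B`) across ten tumors; the function names are illustrative and are not taken from the study. Only the 0.7 threshold comes from the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumed metric)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector has no defined correlation
    return sxy / math.sqrt(sxx * syy)


def high_correlation_genes(copy_number, expression, threshold=0.7):
    """Return {gene: r} for genes whose coefficient exceeds the threshold."""
    selected = {}
    for gene, cn_values in copy_number.items():
        r = pearson(cn_values, expression[gene])
        if r > threshold:
            selected[gene] = r
    return selected


# Hypothetical toy data: one value per tumor, ten tumors per gene.
copy_number = {
    # Mostly diploid, with two strongly amplified tumors.
    "GENE_A": [2, 2, 2, 2, 2, 2, 2, 2, 8, 10],
    # Small copy number changes unrelated to expression.
    "GENE_B": [2, 3, 2, 3, 2, 3, 2, 3, 2, 3],
}
expression = {
    "GENE_A": [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.1, 5.0, 8.0, 9.0],
    "GENE_B": [5, 5, 6, 6, 5, 5, 6, 6, 5, 5],
}

high = high_correlation_genes(copy_number, expression)
print(sorted(high))
```

In this toy input, `GENE_A` passes the threshold almost entirely because of its two amplified tumors, which mirrors the observation that strong correlations were frequently based on a few tumors with high copy number gains. A real analysis would also test each coefficient for significance and adjust for multiple testing, which this sketch omits.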