Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden.
Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden.
PLoS Comput Biol. 2021 Apr 5;17(4):e1008898. doi: 10.1371/journal.pcbi.1008898. eCollection 2021 Apr.
Deregulation of the protein secretory pathway (PSP) is linked to many hallmarks of cancer, such as promoting tissue invasion and modulating cell-cell signaling. The collection of secreted proteins processed by the PSP, known as the secretome, is often studied due to its potential as a reservoir of tumor biomarkers. However, there has been less focus on the protein components of the secretory machinery itself. We therefore investigated the expression changes in secretory pathway components across many different cancer types. Specifically, we implemented a dual approach involving differential expression analysis and machine learning to identify PSP genes whose expression was associated with key tumor characteristics: mutation of p53, cancer status, and tumor stage. Eight different machine learning algorithms were included in the analysis to enable comparison between methods and to focus on signals that were robust to algorithm type. The machine learning approach was validated by its ability to identify PSP genes known to be regulated by p53, and it even outperformed the differential expression analysis approach. Across the different analysis methods and cancer types, the kinesin family members KIF20A and KIF23 were consistently among the top genes associated with malignant transformation or tumor stage. However, unlike most cancer types, which exhibited elevated KIF20A expression that remained relatively constant across tumor stages, renal carcinomas displayed a more gradual increase that continued with increasing disease severity. Collectively, our study demonstrates the complementary nature of a combined differential expression and machine learning approach for analyzing gene expression data, and highlights key PSP components relevant to features of tumor pathophysiology that may constitute potential therapeutic targets.
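The idea of combining a differential expression ranking with a classifier-style ranking, and keeping the genes that score well under both, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the pseudocount, the use of a per-gene AUC as a stand-in for a trained model's feature importance, and the toy expression values (GENE_A, GENE_B, GENE_C) are all assumptions made for illustration only.

```python
import math
from statistics import mean


def log2_fold_change(tumor, normal, pseudo=1.0):
    """Log2 ratio of mean tumor to mean normal expression (pseudocount is an assumption)."""
    return math.log2((mean(tumor) + pseudo) / (mean(normal) + pseudo))


def auc(tumor, normal):
    """Probability that a random tumor sample exceeds a random normal sample (ties count 0.5).

    Used here as a simple proxy for how well one gene separates the two classes.
    """
    wins = 0.0
    for t in tumor:
        for n in normal:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor) * len(normal))


def rank_genes(scores):
    """Map gene -> rank, where rank 1 is the highest score; ties broken by gene name."""
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {gene: i + 1 for i, gene in enumerate(ordered)}


def consensus_ranking(expr):
    """Order genes by the mean of their rank under both scoring methods.

    expr maps gene -> (tumor_values, normal_values).
    """
    de_scores = {g: abs(log2_fold_change(t, n)) for g, (t, n) in expr.items()}
    sep_scores = {g: abs(auc(t, n) - 0.5) for g, (t, n) in expr.items()}
    de_rank = rank_genes(de_scores)
    sep_rank = rank_genes(sep_scores)
    mean_rank = {g: (de_rank[g] + sep_rank[g]) / 2 for g in expr}
    return sorted(expr, key=lambda g: (mean_rank[g], g))


# Synthetic toy values, not real expression data.
toy = {
    "GENE_A": ([50, 60, 55, 65], [10, 12, 11, 9]),
    "GENE_B": ([20, 22, 19, 21], [20, 21, 19, 22]),
    "GENE_C": ([30, 28, 35, 33], [20, 22, 25, 21]),
}

print(consensus_ranking(toy))  # ['GENE_A', 'GENE_C', 'GENE_B']
```

In a real analysis the second score would more plausibly come from the feature importances of several trained classifiers, with ranks averaged across all of them so that only signals robust to algorithm choice rise to the top.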