Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America.
Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America.
PLoS One. 2019 Aug 22;14(8):e0221444. doi: 10.1371/journal.pone.0221444. eCollection 2019.
Gene set analysis (GSA) has become a common methodology for analyzing transcriptomics data. However, self-contained GSA techniques are rarely, if ever, used for proteomics data analysis. Here we present a self-contained proteome-level GSA of four consensus molecular subtypes (CMSs) previously established by transcriptome dissection of colon carcinoma specimens. Despite notable differences in the structure of proteomics and transcriptomics data, many pathway-wide characteristic features of CMSs found at the mRNA level were reproduced at the protein level. In particular, CMS1 features show heavy involvement of the immune system as well as pathways related to mismatch repair, DNA replication, and functioning of the proteasome, while CMS4 tumors upregulate the complement pathway and proteins participating in epithelial-to-mesenchymal transition (EMT). In addition, protein-level GSA yielded a set of novel observations visible at the proteome level but not at the transcriptome level, including possible involvement of major histocompatibility complex II (MHC-II) antigens in the known immunogenicity of CMS1 and a connection between cholesterol trafficking and the regulation of integrin-linked kinase (ILK) in CMS3. Overall, this study demonstrates the utility of self-contained GSA approaches as a critical tool for analyzing proteomics data in general and for dissecting protein-level molecular portraits of human tumors in particular.