Nibbe Rod K, Markowitz Sanford, Myeroff Lois, Ewing Rob, Chance Mark R
Department of Pharmacology, Case Western Reserve University, Cleveland, Ohio 44106, USA.
Mol Cell Proteomics. 2009 Apr;8(4):827-45. doi: 10.1074/mcp.M800428-MCP200. Epub 2008 Dec 19.
We used a systems biology approach to identify and score protein interaction subnetworks whose activity patterns are discriminative of late stage human colorectal cancer (CRC) versus control in colonic tissue. We conducted two gel-based proteomics experiments to identify significantly changing proteins between normal and late stage tumor tissues obtained from an adequately sized cohort of human patients. A total of 67 proteins identified by these experiments were used to seed a search for protein-protein interaction subnetworks. A scoring scheme based on mutual information, calculated using gene expression data as a proxy for subnetwork activity, was developed to score the targets in the subnetworks. Based on this scoring, the subnetworks were pruned to identify the specific protein combinations that were significantly discriminative of late stage cancer versus control. These combinations could not be discovered using only proteomics data or by merely clustering the gene expression data. We then analyzed the resultant pruned subnetworks for biological relevance to human CRC. A number of the proteins in these smaller subnetworks have been associated with the progression (CSNK2A2, PLK1, and IGFBP3) or metastatic potential (PDGFRB) of CRC. Others have recently been identified as potential markers of CRC (IFITM1), and the role of still others in this disease is largely unknown (CCT3, CCT5, CCT7, and GNA12). The functional interactions represented by these signatures provide new experimental hypotheses that merit follow-on validation for biological significance in this disease. Overall, the method outlines a quantitative approach for integrating proteomics data, gene expression data, and the wealth of accumulated legacy experimental data to discover significant protein subnetworks specific to disease.
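The scoring and pruning idea described above can be illustrated with a minimal Python sketch. The abstract does not specify how subnetwork activity is aggregated, how it is discretized, or how pruning proceeds, so everything here is an assumption made for illustration: activity is taken as the mean z-scored expression of the member genes, it is binned into equal-width levels, and pruning is a simple greedy removal. The gene names `GENE_A` and `GENE_B` and all expression values are hypothetical toy data, not results from the study.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def zscore(values):
    """Standardize one gene's expression across samples."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def subnetwork_activity(expression, genes):
    """Assumed activity proxy: mean z-scored expression of member genes, per sample."""
    z = [zscore(expression[g]) for g in genes]
    return [mean(col) for col in zip(*z)]


def discretize(activity, n_bins=3):
    """Assumed discretization: equal-width binning into n_bins levels."""
    lo, hi = min(activity), max(activity)
    if hi == lo:
        return [0] * len(activity)
    width = (hi - lo) / n_bins
    return [min(int((a - lo) / width), n_bins - 1) for a in activity]


def mutual_information(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def score(expression, genes, labels, n_bins=3):
    """Score a gene set by the MI between its binned activity and the class labels."""
    activity = subnetwork_activity(expression, genes)
    return mutual_information(discretize(activity, n_bins), labels)


def greedy_prune(expression, genes, labels):
    """Illustrative pruning: drop one gene at a time while the score strictly improves."""
    current = list(genes)
    best = score(expression, current, labels)
    improved = True
    while improved and len(current) > 1:
        improved = False
        for g in current:
            trial = [h for h in current if h != g]
            s = score(expression, trial, labels)
            if s > best + 1e-12:
                current, best, improved = trial, s, True
                break
    return current, best


# Hypothetical toy data: 4 control samples (label 0) and 4 tumor samples (label 1).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
expression = {
    "GENE_A": [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1],  # tracks the class label
    "GENE_B": [5.0, 1.0, 4.0, 2.0, 1.0, 5.0, 2.0, 4.0],  # unrelated to the label
}

full_score = score(expression, ["GENE_A", "GENE_B"], labels)
pruned_genes, pruned_score = greedy_prune(expression, ["GENE_A", "GENE_B"], labels)
print(full_score, pruned_genes, pruned_score)
```

In this toy case the uninformative gene dilutes the combined activity, so removing it raises the mutual information with the class label. A real analysis would also need a significance test for the scores, for example against permuted labels, which this sketch omits.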