Lu Songjian, Lu Kevin N, Cheng Shi-Yuan, Hu Bo, Ma Xiaojun, Nystrom Nicholas, Lu Xinghua
Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America.
Department of Neurology, Northwestern Brain Tumor Institute, Center for Genetic Medicine, The Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America.
PLoS Comput Biol. 2015 Aug 28;11(8):e1004257. doi: 10.1371/journal.pcbi.1004257. eCollection 2015 Aug.
An important goal of cancer genomic research is to identify the driving pathways underlying disease mechanisms and the heterogeneity of cancers. It is well known that somatic genome alterations (SGAs) affecting the genes that encode the proteins within a common signaling pathway exhibit mutual exclusivity, in which these SGAs usually do not co-occur in a tumor. With some success, this characteristic has been utilized as an objective function to guide the search for driver mutations within a pathway. However, mutual exclusivity alone is not sufficient to indicate that genes affected by such SGAs are in common pathways. Here, we propose a novel, signal-oriented framework for identifying driver SGAs. First, we identify the perturbed cellular signals by mining the gene expression data. Next, we search for a set of SGA events that carries strong information with respect to such perturbed signals while exhibiting mutual exclusivity. Finally, we design and implement an efficient exact algorithm to solve an NP-hard problem encountered in our approach. We apply this framework to the ovarian and glioblastoma tumor data available at the TCGA database, and perform systematic evaluations. Our results indicate that the signal-oriented approach enhances the ability to find informative sets of driver SGAs that likely constitute signaling pathways.
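The search step described above can be illustrated with a toy example. The sketch below is not the authors' algorithm or objective function, which the abstract does not specify. It assumes a hypothetical score that rewards SGA sets covering tumors in which a signal is perturbed, and penalizes both co-occurring SGAs within a tumor and SGAs in tumors without the signal. The function names (`score`, `best_set`), the gene labels, and the tumor data are invented for illustration. The search is a plain exhaustive enumeration, which is exact but exponential in the set size, so it is practical only on tiny inputs.

```python
from itertools import combinations

# Hypothetical toy data: which tumors show the perturbed expression signal,
# and which tumors carry an SGA in each (made-up) gene.
signal = {"t1": True, "t2": True, "t3": True, "t4": True,
          "t5": False, "t6": False}
sga = {
    "A": {"t1", "t2"},
    "B": {"t3", "t4"},
    "C": {"t1", "t2", "t3", "t5"},
    "D": {"t4", "t6"},
}


def score(gene_set, sga, signal):
    """Illustrative score (an assumption, not the paper's objective).

    +1 for each signal-positive tumor hit by at least one SGA in the set,
    -1 for each extra co-occurring SGA in a signal-positive tumor,
    -1 for each SGA falling in a signal-negative tumor.
    """
    covered = 0
    penalty = 0
    for tumor, perturbed in signal.items():
        hits = sum(1 for gene in gene_set if tumor in sga[gene])
        if perturbed:
            if hits >= 1:
                covered += 1
                penalty += hits - 1  # violations of mutual exclusivity
        else:
            penalty += hits  # SGAs that do not explain the signal
    return covered - penalty


def best_set(sga, signal, k):
    """Exhaustively score every k-gene subset and return the best one."""
    candidates = combinations(sorted(sga), k)
    best = max(candidates, key=lambda genes: score(genes, sga, signal))
    return best, score(best, sga, signal)


if __name__ == "__main__":
    print(best_set(sga, signal, 2))
```

On this toy input the pair (A, B) wins with a score of 4: together the two genes cover all four signal-positive tumors, never co-occur, and touch no signal-negative tumor. Gene C covers three signal-positive tumors on its own, but any pair that includes it is penalized for overlap or for the hit in t5.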