Ahmadi Adl Amin, Qian Xiaoning
Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33613, USA.
Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33613, USA; Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX 77843, USA; Department of Pediatrics, University of South Florida, Tampa, FL 33620, USA.
Comput Biol Chem. 2015 Aug;57:3-11. doi: 10.1016/j.compbiolchem.2015.02.010. Epub 2015 Feb 7.
Because of their intricate disease mechanisms, many complex diseases such as cancer show significant heterogeneity, including differences in survival time, treatment response, and recurrence rate. The aim of tumor stratification is to identify disease subtypes, an important first step towards precision medicine. Recent advances in profiling large numbers of molecular variables, such as in The Cancer Genome Atlas (TCGA), have enabled researchers to apply computational methods, including traditional clustering and bi-clustering algorithms, to systematically analyze high-throughput molecular measurements and identify tumor subtypes together with their associated biomarkers. In this study, we discuss critical issues and challenges in existing computational approaches for tumor stratification. We show that the problem can be formulated as finding densely connected sub-graphs (bi-cliques) in a bipartite graph representation of genomic data. We propose a novel algorithm that uses prior biological knowledge, in the form of a gene-gene interaction network, to find such sub-graphs, which helps to identify tumor subtypes and their corresponding genetic markers simultaneously. Our experimental results show that the proposed method outperforms current state-of-the-art methods for tumor stratification.
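The bipartite formulation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm and it does not use a gene-gene interaction network; it only shows, under stated assumptions, how a binary patient-by-gene matrix maps to a bipartite graph and how the density of a candidate sub-graph can be measured. The patient and gene labels, the toy matrix, and the function names `bipartite_edges`, `density`, and `is_biclique` are all hypothetical.

```python
from itertools import product


def bipartite_edges(matrix, patients, genes):
    """Build the edge set of a bipartite graph from a binary matrix.

    An edge (patient, gene) exists when the matrix entry is non-zero,
    e.g. when the gene is altered in that patient (an assumption here).
    """
    return {
        (p, g)
        for i, p in enumerate(patients)
        for j, g in enumerate(genes)
        if matrix[i][j]
    }


def density(edges, patient_subset, gene_subset):
    """Fraction of possible patient-gene pairs in the subsets that are edges."""
    total = len(patient_subset) * len(gene_subset)
    if total == 0:
        return 0.0
    present = sum((p, g) in edges for p, g in product(patient_subset, gene_subset))
    return present / total


def is_biclique(edges, patient_subset, gene_subset):
    """A bi-clique is a sub-graph in which every patient-gene pair is an edge."""
    return all((p, g) in edges for p, g in product(patient_subset, gene_subset))


# Hypothetical toy data: rows are patients, columns are genes.
patients = ["P1", "P2", "P3", "P4"]
genes = ["G1", "G2", "G3", "G4"]
matrix = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]

edges = bipartite_edges(matrix, patients, genes)

# Two candidate subtypes with their candidate marker genes.
subtype_a = density(edges, ["P1", "P2"], ["G1", "G2"])
subtype_b = density(edges, ["P3", "P4"], ["G3", "G4"])

# Adding P3 to the first group dilutes the sub-graph.
mixed = density(edges, ["P1", "P2", "P3"], ["G1", "G2"])
```

In this toy example both candidate groups form complete bi-cliques, while the mixed group is only partially connected. A search procedure would look for patient and gene subsets that keep this density high; how the authors guide that search with network information is not specified in the abstract.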