Institute for Collaborative Biotechnologies, University of California Santa Barbara, Santa Barbara, CA 93106-5080, USA.
BMC Bioinformatics. 2012 Jan 18;13:12. doi: 10.1186/1471-2105-13-12.
In a complex disease, the expression of many genes can be significantly altered, giving rise to a differentially expressed "disease module". Some of these genes correspond directly to the disease phenotype (i.e. "driver" genes), while others are closely related first-degree neighbours in gene interaction space. The remaining genes are further-removed "passenger" genes, which are often not directly related to the original cause of the disease. For prognostic and diagnostic purposes, it is crucial to be able to separate the "driver" genes and their first-degree neighbours (i.e. the "core module") from the general "disease module".
We have developed COMBINER: COre Module Biomarker Identification with Network ExploRation. COMBINER is a novel pathway-based approach for selecting highly reproducible discriminative biomarkers. We applied COMBINER to three benchmark breast cancer datasets to identify prognostic biomarkers. COMBINER-derived biomarkers exhibited 10-fold higher reproducibility than other methods, with up to 30-fold greater enrichment for known cancer-related genes and 4-fold enrichment for known breast cancer susceptibility genes. More than 50% of the resulting biomarkers were cancer-specific and more than 40% were breast cancer-specific. The identified modules were overlaid onto a map of intracellular pathways that comprehensively highlighted the hallmarks of cancer. Furthermore, we constructed a global regulatory network intertwining several functional clusters and uncovered 13 high-confidence "driver" genes of breast cancer metastasis.
COMBINER can efficiently and robustly identify disease core module genes and construct their associated regulatory network. The same approach is potentially applicable to the characterization of any disease that can be probed with microarrays.