Voting-based cancer module identification by combining topological and data-driven properties.

Author information

School of Information and Communications, Gwangju Institute of Science and Technology, Gwangju, South Korea.

Publication information

PLoS One. 2013 Aug 5;8(8):e70498. doi: 10.1371/journal.pone.0070498. Print 2013.

Abstract

Computational approaches that integrate copy number aberrations (CNAs) and gene expression (GE) have recently been studied extensively to identify cancer-related genes and pathways. In this work, we integrate these two data sets with protein-protein interaction (PPI) information to find cancer-related functional modules. To integrate CNA and GE data, we first build a gene-gene relationship network from a set of seed genes by enumerating all types of pairwise correlations (e.g., GE-GE, CNA-GE, and CNA-CNA) over multiple patients. Next, we propose a voting-based cancer module identification algorithm that combines topological and data-driven properties (the VToD algorithm), using the gene-gene relationship network as the source of data-driven information and the PPI data as the topological information. We applied the VToD algorithm to 266 glioblastoma multiforme (GBM) and 96 ovarian carcinoma (OVC) samples that have both expression and copy number measurements, and identified 22 GBM modules and 23 OVC modules. Among the 22 GBM modules, 15, 12, and 20 were significantly enriched with cancer-related KEGG pathways, BioCarta pathways, and GO terms, respectively. Among the 23 OVC modules, 19, 18, and 23 were significantly enriched with cancer-related KEGG pathways, BioCarta pathways, and GO terms, respectively. Similarly, 9 and 2 GBM modules and 15 and 18 OVC modules were enriched with cancer gene census (CGC) genes and specific cancer driver genes, respectively. The proposed module-detection algorithm significantly outperformed existing methods in terms of both functional and cancer gene set enrichment. Most of the cancer-related pathways that our algorithm found in the two cancer data sets contained more than two types of gene-gene relationships, and the number of different relationship types correlated strongly and positively with the CGC enrichment [Formula: see text]-values (0.64 for GBM and 0.49 for OVC). This study suggests that the identified modules, which contain both expression changes and CNAs, can explain cancer-related activities with greater insight.
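
The network-construction step described in the abstract (enumerating GE-GE, CNA-GE, and CNA-CNA correlations across patients) can be illustrated with a minimal sketch. The abstract does not state which correlation measure or cutoff was used, so the Pearson coefficient, the `threshold` value, the function names, and the toy data below are all assumptions made for illustration. The sketch does not reproduce the VToD voting step or the seed-gene selection, and it should not be read as the authors' implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def build_relationship_network(ge, cna, threshold=0.8):
    """Return {(gene_a, gene_b): set of relationship types} for correlated gene pairs.

    ge, cna: dicts mapping gene -> per-patient values (same patient order in both).
    threshold: hypothetical cutoff on |r|; this value is not taken from the paper.
    """
    genes = sorted(set(ge) & set(cna))
    network = {}
    for a, b in combinations(genes, 2):
        candidates = {
            "GE-GE": [(ge[a], ge[b])],
            "CNA-CNA": [(cna[a], cna[b])],
            # CNA-GE is directional, so both orientations are checked.
            "CNA-GE": [(cna[a], ge[b]), (cna[b], ge[a])],
        }
        types = {
            name
            for name, pairs in candidates.items()
            if any(abs(pearson(x, y)) >= threshold for x, y in pairs)
        }
        if types:
            network[(a, b)] = types
    return network


# Toy data: 5 patients, 3 genes (values are made up for illustration only).
ge = {"G1": [1.0, 2.0, 3.0, 4.0, 5.0],
      "G2": [2.1, 3.9, 6.2, 8.0, 9.8],
      "G3": [5.0, 1.0, 4.0, 2.0, 3.0]}
cna = {"G1": [0.0, 0.0, 1.0, 1.0, 2.0],
       "G2": [0.0, 1.0, 0.0, 1.0, 0.0],
       "G3": [2.0, 0.0, 1.0, 0.0, 1.0]}

network = build_relationship_network(ge, cna)
for pair, types in sorted(network.items()):
    print(pair, sorted(types))
```

Each edge carries the set of relationship types that support it, so counting `len(types)` per edge gives the "number of different types of relationship" that the abstract relates to CGC enrichment.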

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3f79/3734239/423d7820ff5c/pone.0070498.g001.jpg
