Integration of multi-omics data to mine cancer-related gene modules.

Author information

Li Peng, Guo Maozu, Sun Bo

Affiliations

School of Artificial Intelligence, Beijing Normal University, Beijing 100875, P. R. China.

School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, P. R. China.

Publication information

J Bioinform Comput Biol. 2019 Dec;17(6):1950038. doi: 10.1142/S0219720019500380.

DOI:10.1142/S0219720019500380

PMID:32019413

Abstract

The identification of cancer-related genes is a major research goal, with implications for determining the pathogenesis of cancer and identifying biomarkers for early diagnosis and treatment. In this study, we propose a network-based method for mining cancer-related genes that integrates multi-omics data, including gene expression, DNA copy number variation, DNA methylation, transcription factor, miRNA, and lncRNA data. First, a random forest-based feature selection method is used to integrate the multi-omics data and identify the key regulatory factors that affect gene expression, and genome-wide regulatory networks are then constructed. Next, the regulatory networks of key candidate genes in variant and non-variant samples are compared to generate a differential expression regulatory network. This differential network contains the set of abnormally regulated genes associated with the key candidate genes. Then, with functional similarity introduced as a distance metric for gene sets, a density-based clustering method is used to mine gene modules related to cancer. We applied this method to LUSC (lung squamous cell carcinoma) and mined cancer-related gene modules composed of 20 genes. GO function and KEGG pathway analyses indicated that the modules were closely related to cancer. A survival analysis verified that the mined gene modules can effectively distinguish between high- and low-risk groups. Overall, these results suggest that the proposed method can be used to identify cancer-related gene modules, providing a basis for the development of biomarkers for diagnosis and treatment.
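
As a rough illustration of the last two steps described in the abstract (building a differential network, then density-based clustering with a functional-similarity distance), here is a minimal Python sketch. Everything concrete in it is an assumption made for illustration and not the authors' implementation: the symmetric-difference definition of the differential network, the gene and regulator names, the toy similarity values, the use of 1 minus similarity as the distance, plain DBSCAN as the density-based method, and the `eps` and `min_pts` settings.

```python
from collections import deque


def differential_edges(variant_edges, nonvariant_edges):
    """Edges present in exactly one of the two regulatory networks.

    Assumption: the differential network is taken as the symmetric
    difference of the two edge sets; the abstract does not specify this.
    """
    return set(variant_edges) ^ set(nonvariant_edges)


def dbscan_precomputed(dist, eps, min_pts):
    """Plain DBSCAN over a precomputed symmetric distance matrix.

    Returns one label per item: 0, 1, ... for clusters, -1 for noise.
    """
    n = len(dist)
    labels = [None] * n  # None means "not visited yet"
    cluster = -1

    def neighbors(i):
        # The neighborhood includes the point itself (dist[i][i] == 0).
        return [j for j in range(n) if dist[i][j] <= eps]

    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, keep expanding
    return labels


# Hypothetical regulator -> target edges in variant and non-variant samples.
variant = {("TF1", "G1"), ("TF1", "G2"), ("miR1", "G3")}
nonvariant = {("TF1", "G1"), ("miR1", "G4")}
diff = differential_edges(variant, nonvariant)

# Hypothetical pairwise functional similarity (1.0 = identical function).
genes = ["A", "B", "C", "D", "E"]
similarity = [
    [1.00, 0.90, 0.80, 0.10, 0.20],
    [0.90, 1.00, 0.85, 0.20, 0.10],
    [0.80, 0.85, 1.00, 0.15, 0.10],
    [0.10, 0.20, 0.15, 1.00, 0.30],
    [0.20, 0.10, 0.10, 0.30, 1.00],
]
# Assumption: distance is defined as 1 - similarity.
distance = [[1.0 - s for s in row] for row in similarity]

labels = dbscan_precomputed(distance, eps=0.3, min_pts=3)
modules = {}
for gene, label in zip(genes, labels):
    modules.setdefault(label, []).append(gene)
print(sorted(diff))
print(modules)
```

On this toy input, the three mutually similar genes (A, B, C) form one module, while D and E are labeled as noise (label -1). The differential network keeps only the edges that appear in one sample group but not the other.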

Similar articles

Integration of multi-omics data to mine cancer-related gene modules.

J Bioinform Comput Biol. 2019 Dec;17(6):1950038. doi: 10.1142/S0219720019500380.

Multi-omics analysis at epigenomics and transcriptomics levels reveals prognostic subtypes of lung squamous cell carcinoma.

Biomed Pharmacother. 2020 May;125:109859. doi: 10.1016/j.biopha.2020.109859. Epub 2020 Feb 7.

SUBATOMIC: a SUbgraph BAsed mulTi-OMIcs clustering framework to analyze integrated multi-edge networks.

BMC Bioinformatics. 2022 Sep 5;23(1):363. doi: 10.1186/s12859-022-04908-3.

A prognostic 4-lncRNA expression signature for lung squamous cell carcinoma.

Artif Cells Nanomed Biotechnol. 2018 Sep;46(6):1207-1214. doi: 10.1080/21691401.2017.1366334. Epub 2017 Aug 24.

ICan: an integrated co-alteration network to identify ovarian cancer-related genes.

PLoS One. 2015 Mar 24;10(3):e0116095. doi: 10.1371/journal.pone.0116095. eCollection 2015.

Identification of mutated core cancer modules by integrating somatic mutation, copy number variation, and gene expression data.

BMC Syst Biol. 2013;7 Suppl 2(Suppl 2):S4. doi: 10.1186/1752-0509-7-S2-S4. Epub 2013 Oct 14.

Integrated analysis of lncRNA-miRNA-mRNA ceRNA network in squamous cell carcinoma of tongue.

BMC Cancer. 2019 Aug 7;19(1):779. doi: 10.1186/s12885-019-5983-8.

Inferring RBP-Mediated Regulation in Lung Squamous Cell Carcinoma.

PLoS One. 2016 May 17;11(5):e0155354. doi: 10.1371/journal.pone.0155354. eCollection 2016.

Robust pathway-based multi-omics data integration using directed random walks for survival prediction in multiple cancer studies.

Biol Direct. 2019 Apr 29;14(1):8. doi: 10.1186/s13062-019-0239-8.

Predicting Functional Modules of Liver Cancer Based on Differential Network Analysis.

Interdiscip Sci. 2019 Dec;11(4):636-644. doi: 10.1007/s12539-018-0314-3. Epub 2019 Jan 2.

Cited by

3PNMF-MKL: A non-negative matrix factorization-based multiple kernel learning method for multi-modal data integration and its application to gene signature detection.

Front Genet. 2023 Feb 14;14:1095330. doi: 10.3389/fgene.2023.1095330. eCollection 2023.

Exploration of Hanshi Zufei prescription for treatment of COVID-19 based on network pharmacology.

Chin Herb Med. 2022 Apr;14(2):294-302. doi: 10.1016/j.chmed.2021.06.006. Epub 2022 Mar 31.
