Improved cancer biomarkers identification using network-constrained infinite latent feature selection.

Authors

Cai Lihua, Wu Honglong, Zhou Ke

Affiliations

Wuhan National Laboratory for Optoelectronics, School of Computer Science & Technology, Huazhong University of Science & Technology, Wuhan, Hubei, China.

School of Mathematics and Computer Science, Guangdong Ocean University, Zhanjiang, Guangdong, China.

Publication information

PLoS One. 2021 Feb 11;16(2):e0246668. doi: 10.1371/journal.pone.0246668. eCollection 2021.

DOI:10.1371/journal.pone.0246668

PMID:33571282

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7877636/

Abstract

Identifying biomarkers associated with different types of cancer is an important goal in bioinformatics. Different research groups have analyzed the expression profiles of many genes and found certain genetic patterns that can help improve targeted therapies, but the significance of some genes remains ambiguous. More reliable and effective biomarker identification methods are therefore needed to detect candidate cancer-related genes. In this paper, we propose a novel method that combines the infinite latent feature selection (ILFS) method with the functional interaction (FI) network to rank biomarkers. We applied the proposed method to expression data from five cancer types. The experiments indicated that our network-constrained ILFS (NCILFS) predicts the diagnosis of the samples more accurately and locates many more known oncogenes than the original ILFS and several other existing methods. We also performed functional enrichment analysis by inspecting the over-represented gene ontology (GO) biological process (BP) terms and by applying the gene set enrichment analysis (GSEA) method to the biomarkers selected by each feature selection method. The enrichment analysis reports show that network-constrained ILFS produces more biologically significant gene sets than the other methods. The results suggest that network-constrained ILFS can identify cancer-related genes with higher discriminative power and biological significance.
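
The abstract does not say which statistical test was used to decide that a GO term is over-represented. A common choice for this kind of analysis is a one-sided hypergeometric test, so the following is only a minimal sketch under that assumption, not the authors' implementation. The function name `enrichment_p_value` and its parameter names are hypothetical and chosen for illustration.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int,
                       selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    selected:   number of genes picked by the feature selector
    overlap:    selected genes that carry the annotation
    """
    # Number of ways to draw `selected` genes from the background.
    total = comb(population, selected)
    # The overlap can never exceed either group size.
    upper = min(annotated, selected)
    # Count draws with at least `overlap` annotated genes.
    # math.comb returns 0 when k > n, so impossible splits contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (not from the paper): 20 background genes, 5 annotated,
# 4 selected, 2 of the selected genes annotated.
p = enrichment_p_value(20, 5, 4, 2)
print(f"p = {p:.4f}")
```

A small p-value suggests that the selected gene set contains more genes annotated with the term than random selection would be expected to produce. In practice the p-values for many terms would also need a multiple-testing correction, which this sketch leaves out.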

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eec6/7877636/89a2d0268824/pone.0246668.g001.jpg
