Identifying Cancer Biomarkers From Microarray Data Using Feature Selection and Semisupervised Learning.

Publication information

IEEE J Transl Eng Health Med. 2014 Dec 2;2:4300211. doi: 10.1109/JTEHM.2014.2375820. eCollection 2014.

DOI: 10.1109/JTEHM.2014.2375820

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4848046/

Abstract

Microarrays have gone from obscurity to near ubiquity in biological research. Over the same period, the statistical methodology for microarray analysis has progressed from simple visual assessment of results to novel algorithms for analyzing changes in expression profiles. In a microRNA (miRNA) or gene-expression profiling experiment, the expression levels of thousands of genes or miRNAs are monitored simultaneously to study how particular treatments, diseases, and developmental stages affect their expression. Microarray-based gene expression profiling can identify genes whose expression changes in response to pathogens or other organisms by comparing gene expression in infected cells or tissues with that in uninfected ones. Recent studies have shown that patterns of altered microarray expression profiles in cancer can serve as molecular biomarkers for tumor diagnosis, prognosis of disease-specific outcomes, and prediction of therapeutic responses. Microarray data sets containing the expression profiles of a number of miRNAs or genes are used to identify biomarkers that are dysregulated between normal and malignant tissues. However, small sample size remains a bottleneck in designing successful classification methods. On the other hand, an adequate number of microarray samples without clinical knowledge (that is, unlabeled samples) can be employed as an additional source of information. In this paper, a combination of kernelized fuzzy rough set (KFRS) feature selection and a semisupervised support vector machine (S³VM) is proposed for predicting cancer biomarkers from one miRNA and three gene expression data sets. Biomarkers are discovered using three feature selection methods, including KFRS. The effectiveness of the proposed KFRS and S³VM combination on the microarray data sets is demonstrated, and the cancer biomarkers identified from the miRNA data are reported. Furthermore, biological significance tests are conducted for the miRNA cancer biomarkers.
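The two-stage idea in the abstract (select a few informative features from a small labeled set, then let unlabeled samples refine the classifier) can be illustrated with a minimal sketch. This is not the paper's method: a signal-to-noise filter stands in for KFRS, and self-training with a nearest-centroid classifier stands in for the semisupervised SVM. The synthetic data, the three planted "biomarker" features, and all function names are assumptions made for illustration only.

```python
import math
import random
from statistics import mean, pstdev


def snr_scores(X, y):
    """Signal-to-noise ratio of each feature for a two-class problem (labels 0/1)."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = pstdev(a) + pstdev(b) or 1e-12  # guard against zero spread
        scores.append(abs(mean(a) - mean(b)) / spread)
    return scores


def top_k_features(X, y, k):
    """Indices of the k highest-scoring features (stand-in for KFRS selection)."""
    scores = snr_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def centroids(X, y, feats):
    """Per-class mean vector restricted to the selected features."""
    out = {}
    for lab in set(y):
        rows = [row for row, l in zip(X, y) if l == lab]
        out[lab] = [mean(r[j] for r in rows) for j in feats]
    return out


def predict(cents, row, feats):
    """Return (label, margin); margin is the gap between the two centroid distances."""
    point = [row[j] for j in feats]
    dist = {lab: math.dist(point, c) for lab, c in cents.items()}
    ranked = sorted(dist, key=dist.get)
    return ranked[0], dist[ranked[1]] - dist[ranked[0]]


def self_train(X_lab, y_lab, X_unl, feats, rounds=5, per_round=5):
    """Self-training (stand-in for S3VM): absorb the most confident unlabeled samples."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unl)
    for _ in range(rounds):
        if not pool:
            break
        cents = centroids(X, y, feats)
        preds = [(predict(cents, r, feats), r) for r in pool]
        preds.sort(key=lambda p: -p[0][1])  # largest margin first
        for (lab, _), row in preds[:per_round]:
            X.append(row)
            y.append(lab)  # pseudo-label assigned by the current model
        pool = [r for _, r in preds[per_round:]]
    return centroids(X, y, feats)


def run_demo(seed=0):
    """Synthetic data: 50 'genes', of which features 0-2 are planted biomarkers."""
    rng = random.Random(seed)

    def sample(label):
        row = [rng.gauss(0.0, 1.0) for _ in range(50)]
        for j in (0, 1, 2):
            row[j] += 4.0 if label == 1 else -4.0
        return row

    y_lab = [0, 1] * 8                      # small labeled set
    X_lab = [sample(l) for l in y_lab]
    X_unl = [sample(i % 2) for i in range(60)]  # labels discarded
    y_test = [i % 2 for i in range(40)]
    X_test = [sample(l) for l in y_test]

    feats = top_k_features(X_lab, y_lab, k=3)
    cents = self_train(X_lab, y_lab, X_unl, feats)
    hits = sum(predict(cents, r, feats)[0] == l for r, l in zip(X_test, y_test))
    return feats, hits / len(y_test)


if __name__ == "__main__":
    selected, accuracy = run_demo()
    print("selected features:", selected, "test accuracy:", accuracy)
```

The planted features are deliberately easy to separate, so the sketch only shows how the pieces fit together. It says nothing about how KFRS or S³VM would perform on real expression data.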

