Cancer molecular pattern discovery by subspace consensus kernel classification.

Author information

Han Xiaoxu

Affiliation

Department of Mathematics and Bioinformatics Program, Eastern Michigan University, Ypsilanti, MI 48197, USA.

Publication information

Comput Syst Bioinformatics Conf. 2007;6:55-65.

Abstract

Efficient discovery of cancer molecular patterns is essential in molecular diagnostics. The characteristics of gene and protein expression data challenge traditional unsupervised classification algorithms. In this work, we describe a subspace consensus kernel clustering algorithm based on projected gradient nonnegative matrix factorization (PG-NMF). The algorithm is a consensus kernel hierarchical clustering (CKHC) method applied in the subspace generated by PG-NMF. It integrates convergence-sound parts-based learning with subspace and kernel space clustering for microarray and proteomics data classification. We first integrated subspace methods and kernel methods by following our framework of input space, subspace and kernel space clustering. On four benchmark cancer datasets, our algorithm produces more effective classification results than the classic NMF and sparse-NMF classifications and the supervised classifications (KNN and SVM). The approach can generate a family of classification algorithms in machine learning by selecting different transforms to generate subspaces and different kernel clustering algorithms to cluster the data.
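The pipeline described in the abstract (factorize into a nonnegative subspace, cluster hierarchically in a kernel space, then take a consensus) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`pg_nmf`, `kernel_hclust`, `subspace_consensus_clustering`) are hypothetical, the projected gradient step uses a fixed 1/L step size rather than a line search, and the RBF kernel, average linkage, and consensus over random restarts are choices made here because the abstract does not specify them.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def pg_nmf(X, rank, n_iter=200, seed=0):
    """Approximate X (features x samples) by W @ H with W, H >= 0.

    Alternating projected gradient: take a gradient step on one factor,
    then project onto the nonnegative orthant by clipping at zero.
    Assumption: fixed step 1/L, where L is the Lipschitz constant of the
    gradient of 0.5 * ||X - W H||_F^2 with respect to that factor.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank))
    H = rng.random((rank, X.shape[1]))
    for _ in range(n_iter):
        step_h = 1.0 / (np.linalg.norm(W.T @ W, 2) + 1e-12)
        H = np.maximum(H - step_h * (W.T @ (W @ H - X)), 0.0)
        step_w = 1.0 / (np.linalg.norm(H @ H.T, 2) + 1e-12)
        W = np.maximum(W - step_w * ((W @ H - X) @ H.T), 0.0)
    return W, H


def kernel_hclust(Z, n_clusters):
    """Hierarchical clustering of the rows of Z using RBF kernel distances."""
    d2 = pdist(Z, metric="sqeuclidean")          # condensed squared distances
    gamma = 1.0 / (np.median(d2) + 1e-12)        # assumed bandwidth heuristic
    K = np.exp(-gamma * d2)                      # RBF kernel values k(i, j)
    # Feature-space distance: ||phi(i) - phi(j)||^2 = 2 - 2 k(i, j) for RBF.
    d_kernel = np.sqrt(np.maximum(2.0 - 2.0 * K, 0.0))
    tree = linkage(d_kernel, method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def subspace_consensus_clustering(X, n_clusters, n_runs=5, n_iter=200):
    """Consensus of kernel hierarchical clusterings over several NMF restarts."""
    n = X.shape[1]
    co_assoc = np.zeros((n, n))
    for seed in range(n_runs):
        _, H = pg_nmf(X, rank=n_clusters, n_iter=n_iter, seed=seed)
        labels = kernel_hclust(H.T, n_clusters)  # samples in the NMF subspace
        co_assoc += labels[:, None] == labels[None, :]
    co_assoc /= n_runs                           # fraction of runs co-clustered
    # Cluster the consensus matrix itself, using 1 - co-association as distance.
    iu = np.triu_indices(n, k=1)                 # same pair order as pdist
    tree = linkage((1.0 - co_assoc)[iu], method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Synthetic example: 40 features, 20 samples, two groups of 10 samples each.
rng = np.random.default_rng(1)
X_demo = rng.random((40, 20)) * 0.1
X_demo[:20, :10] += 5.0
X_demo[20:, 10:] += 5.0
demo_labels = subspace_consensus_clustering(X_demo, n_clusters=2)
print(demo_labels)
```

Swapping `pg_nmf` for another transform, or `kernel_hclust` for another kernel clustering routine, gives a different member of the family of algorithms the abstract mentions; the consensus step is unchanged because it only consumes cluster labels.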
