Gene Classification Based on Multi-Class SVMs with Systematic Sampling and Hierarchical Clustering (SSHC) Algorithm.

Affiliation

University of Sulaimani, College of Science, Computer Department, Sulaymaniyah, Iraq.

Publication information

Adv Exp Med Biol. 2021;1338:231-237. doi: 10.1007/978-3-030-78775-2_28.

Abstract

Support vector machines (SVMs) are machine learning algorithms with high classification accuracy. However, SVM training is computationally expensive, so the method is not very efficient on large datasets. In this study, we use multi-class support vector machines combined with systematic sampling and hierarchical clustering (the SSHC-MCSVM algorithm) to classify gene expression data. Gene expression profiles are treated as large datasets; the two datasets used in this study come from obese and lean individuals. In the proposed SSHC-MCSVM algorithm, the gene expression data are regrouped into new sets of genes by the systematic sampling with hierarchical clustering (SSHC) algorithm. The SSHC algorithm is repeated n times, and the k-partitions whose clusters have a high adjusted Rand index (ARI) are chosen. Multi-class support vector machines are then applied to the best regrouped gene expression data to classify the significant genes. Performance is measured by accuracy, recall, and precision. The proposed SSHC-MCSVM algorithm was able to classify the significant genes with high accuracy, recall, and precision.
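Two building blocks named in the abstract, systematic sampling and the adjusted Rand index, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `systematic_sample` and `adjusted_rand_index`, the fixed sampling step, and the idea of comparing two label lists are assumptions made for illustration, and the abstract does not state which partitions the ARI is computed between.

```python
from collections import Counter
from math import comb


def systematic_sample(items, step, start=0):
    """Return every `step`-th item, beginning at index `start`.

    Hypothetical helper: the abstract does not specify the sampling
    interval or the starting offset used by SSHC.
    """
    if step < 1 or not 0 <= start < step:
        raise ValueError("need step >= 1 and 0 <= start < step")
    return items[start::step]


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items.

    Uses the standard pair-counting formula:
    (index - expected_index) / (max_index - expected_index).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs placed together in both partitions, and in each one separately.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)
    max_index = (together_a + together_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (together_both - expected) / (max_index - expected)


if __name__ == "__main__":
    genes = [f"gene_{i}" for i in range(10)]
    print(systematic_sample(genes, step=3, start=1))
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

An ARI of 1 means the two partitions agree up to a renaming of the cluster labels, while values near 0 indicate agreement no better than chance, which is why a high ARI is a plausible criterion for keeping a partition across repeated runs.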

