Comparative Study of Classification Algorithms for Various DNA Microarray Data.

Authors

Kim Jingeun, Yoon Yourim, Park Hye-Jin, Kim Yong-Hyuk

Affiliations

Department of IT Convergence Engineering, Gachon University, Seongnam-daero 1342, Seongnam-si 13120, Korea.

Department of Computer Engineering, College of Information Technology, Gachon University, Seongnam-daero 1342, Sujeong-gu, Seongnam-si 13120, Korea.

Publication information

Genes (Basel). 2022 Mar 11;13(3):494. doi: 10.3390/genes13030494.

Abstract

Microarrays apply electrical engineering and technology to biology: they measure the expression of many genes simultaneously and can be used to analyze specific diseases. This study performs classification analyses on various microarray datasets to compare the performance of classification algorithms across different data traits. The datasets were classified into test and control groups using five machine learning methods, MultiLayer Perceptron (MLP), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and k-Nearest Neighbors (KNN), and the resulting accuracies were compared. Performance was evaluated with k-fold cross-validation. In the experiments, the two tree-based methods, DT and RF, showed similar trends, and the remaining three methods, MLP, SVM, and KNN, showed similar trends to one another. DT and RF generally performed worse than the other methods, except on one dataset. This suggests that, for effective classification of microarray data, selecting a classification algorithm suited to the data traits is crucial for optimum performance.
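
The evaluation protocol described in the abstract (score each classifier by k-fold cross-validation, then compare accuracies) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' code: the data are synthetic stand-ins for expression profiles, all function names are hypothetical, and the fold count (5) and neighbor count (3) are arbitrary choices that do not come from the paper. Only KNN and a majority-class baseline are implemented here; in practice the other methods (MLP, SVM, DT, RF) would be plugged into the same harness, typically from a machine learning library.

```python
import random
from collections import Counter
from math import dist


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and split them into n_folds near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Majority vote among the n_neighbors training samples closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))
    return Counter(train_y[i] for i in order[:n_neighbors]).most_common(1)[0][0]


def majority_predict(train_X, train_y, x):
    """Baseline: always predict the most frequent training label."""
    return Counter(train_y).most_common(1)[0][0]


def cross_val_accuracy(X, y, predict, n_folds=5, seed=0):
    """Mean held-out accuracy of `predict` across n_folds folds."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(predict(train_X, train_y, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def make_toy_expression_data(n_per_class=30, n_genes=20, seed=0):
    """Synthetic data (assumption): class 1 has shifted expression in the first 5 genes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(3.0 * label if g < 5 else 0.0, 1.0)
                      for g in range(n_genes)])
            y.append(label)
    return X, y


X, y = make_toy_expression_data()
knn_accuracy = cross_val_accuracy(X, y, knn_predict)
baseline_accuracy = cross_val_accuracy(X, y, majority_predict)
print(f"KNN 5-fold accuracy:      {knn_accuracy:.3f}")
print(f"Baseline 5-fold accuracy: {baseline_accuracy:.3f}")
```

Both classifiers are scored on the same folds (same seed), so the difference in accuracy reflects the method and not the split, which is the kind of like-for-like comparison the abstract describes.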


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf91/8951024/a76135371d99/genes-13-00494-g001.jpg
