Gene selection and classification from microarray data using kernel machine.

Author information

Cho Ji-Hoon, Lee Dongkwon, Park Jin Hyun, Lee In-Beum

Affiliation

Department of Chemical Engineering, Pohang University of Science and Technology, San 31 Hyoja-Dong, Pohang 790-784, Republic of Korea.

Publication information

FEBS Lett. 2004 Jul 30;571(1-3):93-8. doi: 10.1016/j.febslet.2004.05.087.

PMID:15280023

Abstract

The discrimination of cancer patients (including subtypes) based on gene expression data is a critical problem with clinical ramifications. Central to solving this problem is the issue of how to extract the most relevant genes from the several thousand genes on a typical microarray. Here, we propose a methodology that can effectively select an informative subset of genes and classify the subtypes (or patients) of disease using the selected genes. We employ a kernel machine, kernel Fisher discriminant analysis (KFDA), for discrimination and use the derivatives of the kernel function to perform gene selection. Using a modified form of KFDA in the minimum squared error (MSE) sense and the gradients of the kernel functions, we construct an effective gene selection criterion. We assess the performance of the proposed methodology by applying it to three gene expression datasets: leukemia dataset, breast cancer dataset and colon cancer dataset. Using a few informative genes, the proposed method accurately and reliably classified cancer subtypes (or patients). Also, through a comparison study, we verify the reliability of the gene selection and discrimination results.
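The abstract does not give the exact formulation, so the following is only a minimal sketch of the general idea, under stated assumptions: a least-squares (MSE) kernel discriminant with a Gaussian kernel, and a gene score defined as the mean absolute gradient of the discriminant with respect to each gene. The function names (`rbf_kernel`, `fit_mse_kernel_discriminant`, `gene_scores`), the regularization term `reg`, the kernel width `sigma`, and the scoring rule itself are illustrative choices, not the authors' criterion.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix: k(a, b) = exp(-||a - b||^2 / (2 sigma^2))."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def fit_mse_kernel_discriminant(X, y, sigma, reg=0.1):
    """Fit f(x) = sum_i alpha_i k(x_i, x) + b to labels y in {-1, +1}
    by regularized least squares (an assumed stand-in for MSE-form KFDA)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    # Linear system for [alpha; b] with the constraint sum(alpha) = 0.
    A = np.zeros((n + 1, n + 1))
    A[:n, :n] = K + reg * np.eye(n)
    A[:n, n] = 1.0
    A[n, :n] = 1.0
    sol = np.linalg.solve(A, np.append(y.astype(float), 0.0))
    return sol[:n], sol[n]

def predict(X_train, alpha, b, sigma, X_new):
    """Class labels in {-1, +1} from the sign of the discriminant."""
    return np.sign(rbf_kernel(X_new, X_train, sigma) @ alpha + b)

def gene_scores(X, alpha, sigma):
    """Mean absolute gradient of f with respect to each gene (assumed score).
    For the Gaussian kernel, d k(x, x_i) / dx = k(x, x_i) (x_i - x) / sigma^2."""
    K = rbf_kernel(X, X, sigma)      # K[j, i] = k(x_j, x_i)
    W = K * alpha[None, :]           # W[j, i] = alpha_i k(x_j, x_i)
    grad = (W @ X - W.sum(1, keepdims=True) * X) / sigma ** 2
    return np.abs(grad).mean(axis=0)

# Synthetic illustration (not real microarray data): 40 samples, 20 "genes",
# with only gene 0 carrying class information.
rng = np.random.default_rng(0)
y = np.repeat([-1, 1], 20)
X = rng.normal(size=(40, 20))
X[:, 0] += 3.0 * y

alpha, b = fit_mse_kernel_discriminant(X, y, sigma=5.0)
scores = gene_scores(X, alpha, sigma=5.0)
top_genes = np.argsort(scores)[::-1][:5]   # indices of the highest-scoring genes
train_acc = (predict(X, alpha, b, 5.0, X) == y).mean()
```

In practice one would rank genes by such a score, keep the top few, refit the classifier on the reduced set, and estimate accuracy with cross-validation rather than on the training samples.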
