Multiclass cancer classification using gene expression profiling and probabilistic neural networks.

Authors

Berrar Daniel P, Downes C Stephen, Dubitzky Werner

Affiliation

School of Biomedical Sciences, University of Ulster at Coleraine, BT52 1SA, Northern Ireland.

Publication information

Pac Symp Biocomput. 2003:5-16.

PMID: 12603013

Abstract

Gene expression profiling by microarray technology has been successfully applied to classification and diagnostic prediction of cancers. Various machine learning and data mining methods are currently used for classifying gene expression data. However, these methods have not been developed to address the specific requirements of gene microarray analysis. First, microarray data are characterized by a high-dimensional feature space often exceeding the sample space dimensionality by a factor of 100 or more. In addition, microarray data exhibit a high degree of noise. Most of the discussed methods do not adequately address the problem of dimensionality and noise. Furthermore, although machine learning and data mining methods are based on statistics, most such techniques do not address the biologist's requirement for sound mathematical confidence measures. Finally, most machine learning and data mining classification methods fail to incorporate misclassification costs, i.e., they are indifferent to the costs associated with false positive and false negative classifications. In this paper, we present a probabilistic neural network (PNN) model that addresses all these issues. The PNN model provides sound statistical confidences for its decisions, and it is able to model asymmetrical misclassification costs. Furthermore, we demonstrate the performance of the PNN for multiclass gene expression data sets. Here, we compare the performance of the PNN with two machine learning methods, a decision tree and a neural network. To assess and evaluate the performance of the classifiers, we use a lift-based scoring system that allows a fair comparison of different models. The PNN clearly outperformed the other models. The results demonstrate the successful application of the PNN model for multiclass cancer classification.
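The abstract does not describe the network itself, so the following is only a minimal sketch of the textbook PNN formulation (Gaussian Parzen-window density estimates per class, followed by a Bayes decision), not the authors' implementation. It illustrates the two properties the abstract highlights: class posteriors that can serve as confidence values, and a decision rule that accepts asymmetric misclassification costs. The function names, the smoothing parameter `sigma`, the cost dictionary layout, and the toy two-feature data are all assumptions made for illustration.

```python
import math

def pnn_posteriors(train_x, train_y, x, sigma=0.5, priors=None):
    """Return {class: posterior} for sample x using Gaussian Parzen windows.

    sigma is a hypothetical smoothing parameter; priors default to class frequencies.
    """
    scores = {}
    for c in sorted(set(train_y)):
        members = [xi for xi, yi in zip(train_x, train_y) if yi == c]
        # Pattern layer: one Gaussian kernel per training sample of class c.
        kernels = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * sigma ** 2))
            for xi in members
        ]
        # Summation layer: the mean activation estimates the class-conditional
        # density up to a constant that is identical for every class.
        density = sum(kernels) / len(members)
        prior = priors[c] if priors else len(members) / len(train_y)
        scores[c] = prior * density
    total = sum(scores.values())
    if total == 0.0:  # all kernels underflowed; fall back to a uniform answer
        return {c: 1.0 / len(scores) for c in scores}
    return {c: s / total for c, s in scores.items()}

def expected_cost(posteriors, predicted, cost):
    """Expected cost of predicting `predicted`; unspecified pairs use 0/1 loss."""
    return sum(
        p * cost.get((true, predicted), 0.0 if true == predicted else 1.0)
        for true, p in posteriors.items()
    )

def pnn_decide(posteriors, cost=None):
    """Decision layer: choose the class with the minimum expected cost."""
    cost = cost or {}
    return min(posteriors, key=lambda c: expected_cost(posteriors, c, cost))

# Toy data (hypothetical): two classes, two "expression" features each.
train_x = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]]
train_y = ["A", "A", "B", "B"]
post = pnn_posteriors(train_x, train_y, [0.4, 0.4])

symmetric = pnn_decide(post)
# Assume that missing a true "B" (predicting "A") is five times as costly.
asymmetric = pnn_decide(post, cost={("B", "A"): 5.0})
print(post, symmetric, asymmetric)
```

In this toy case the query point lies closer to class A, so the symmetric rule picks A, while the assumed cost of 5 for missing a true B is large enough to shift the decision to B. The posteriors themselves are unchanged by the cost setting; only the decision layer is affected.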

Similar articles

Gene selection from microarray data for cancer classification--a machine learning approach.

Comput Biol Chem. 2005 Feb;29(1):37-46. doi: 10.1016/j.compbiolchem.2004.11.001.

Hybrid genetic algorithm-neural network: feature extraction for unpreprocessed microarray data.

Artif Intell Med. 2011 Sep;53(1):47-56. doi: 10.1016/j.artmed.2011.06.008. Epub 2011 Jul 19.

Multiclass cancer classification using semisupervised ellipsoid ARTMAP and particle swarm optimization with gene expression data.

IEEE/ACM Trans Comput Biol Bioinform. 2007 Jan-Mar;4(1):65-77. doi: 10.1109/TCBB.2007.1009.

Tumor classification ranking from microarray data.

BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S21. doi: 10.1186/1471-2164-9-S2-S21.

GenSo-FDSS: a neural-fuzzy decision support system for pediatric ALL cancer subtype identification using gene expression data.

Artif Intell Med. 2005 Jan;33(1):61-88. doi: 10.1016/j.artmed.2004.03.009.

A new classification model with simple decision rule for discovering optimal feature gene pairs.

Comput Biol Med. 2007 Nov;37(11):1637-46. doi: 10.1016/j.compbiomed.2007.03.004. Epub 2007 May 7.

Mixture classification model based on clinical markers for breast cancer prognosis.

Artif Intell Med. 2010 Feb-Mar;48(2-3):129-37. doi: 10.1016/j.artmed.2009.07.008. Epub 2009 Dec 14.

Reducing multiclass cancer classification to binary by output coding and SVM.

Comput Biol Chem. 2006 Feb;30(1):63-71. doi: 10.1016/j.compbiolchem.2005.10.008.

Tumor classification by combining PNN classifier ensemble with neighborhood rough set based gene reduction.

Comput Biol Med. 2010 Feb;40(2):179-89. doi: 10.1016/j.compbiomed.2009.11.014. Epub 2009 Dec 30.

Cited by

Bayesian approach for predicting responses to therapy from high-dimensional time-course gene expression profiles.

BMC Bioinformatics. 2021 Mar 18;22(1):132. doi: 10.1186/s12859-021-04052-4.

Application of machine learning in the management of acute myeloid leukemia: current practice and future prospects.

Blood Adv. 2020 Dec 8;4(23):6077-6085. doi: 10.1182/bloodadvances.2020002997.

Lopinavir Resistance Classification with Imbalanced Data Using Probabilistic Neural Networks.

J Med Syst. 2016 Mar;40(3):69. doi: 10.1007/s10916-015-0428-7. Epub 2016 Jan 6.

Use of kernel-based Bayesian models to predict late osteolysis after hip replacement.

J R Soc Interface. 2013 Sep 18;10(88):20130678. doi: 10.1098/rsif.2013.0678. Print 2013 Nov 6.

Gene expression based leukemia sub-classification using committee neural networks.

Bioinform Biol Insights. 2009 Sep 3;3:89-98. doi: 10.4137/bbi.s2908.

A white-box approach to microarray probe response characterization: the BaFL pipeline.

BMC Bioinformatics. 2009 Dec 29;10:449. doi: 10.1186/1471-2105-10-449.

Prion disease diagnosis by proteomic profiling.

J Proteome Res. 2009 Feb;8(2):1030-6. doi: 10.1021/pr800832s.

Classification algorithms for phenotype prediction in genomics and proteomics.

Front Biosci. 2008 Jan 1;13:691-708. doi: 10.2741/2712.

A simple method to combine multiple molecular biomarkers for dichotomous diagnostic classification.

BMC Bioinformatics. 2006 Oct 10;7:442. doi: 10.1186/1471-2105-7-442.
