Regularized Least Squares Cancer classifiers from DNA microarray data.

Author information

Ancona Nicola, Maglietta Rosalia, D'Addabbo Annarita, Liuni Sabino, Pesole Graziano

Affiliation

Istituto di Studi sui Sistemi Intelligenti per l'Automazione, CNR, Via Amendola 122/D-I, 70126 Bari, Italy.

Publication information

BMC Bioinformatics. 2005 Dec 1;6 Suppl 4(Suppl 4):S2. doi: 10.1186/1471-2105-6-S4-S2.

DOI:10.1186/1471-2105-6-S4-S2

PMID:16351746

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1866388/

Abstract

BACKGROUND

The advent of DNA microarray technology constitutes an epochal change in the classification and discovery of different types of cancer, because the information provided by DNA microarrays allows the problem of cancer analysis to be approached from a quantitative rather than qualitative point of view. Cancer classification requires well-founded mathematical methods that can predict the status of new specimens with high significance levels starting from a limited amount of data. In this paper we assess the performance of Regularized Least Squares (RLS) classifiers, originally proposed in regularization theory, by comparing them with Support Vector Machines (SVM), the state-of-the-art supervised learning technique for cancer classification from DNA microarray data. The performance of both approaches has also been investigated with respect to the number of selected genes and different gene selection strategies.

RESULTS

We show that RLS classifiers perform comparably to SVM classifiers, as measured by the Leave-One-Out (LOO) error evaluated on three different data sets. The main advantage of RLS machines is that solving a classification problem requires only a linear system whose order equals either the number of features or the number of training examples. Moreover, RLS machines yield an exact measure of the LOO error from a single training run.
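The two properties stated above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a linear kernel, no bias term, labels in {-1, +1}, and a regularization parameter `lam` that is not scaled by the number of examples. The function names and the synthetic data are hypothetical. The weight vector is obtained either from a d x d system (d features) or from an n x n system (n training examples), and the leave-one-out outputs follow from one fit through the standard identity y_i - f^(-i)(x_i) = c_i / (G^-1)_ii, where G = X X^T + lam I and c = G^-1 y.

```python
import numpy as np


def rls_primal(X, y, lam):
    """Linear RLS via a d x d system (d = number of features)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def rls_dual(X, y, lam):
    """Same solution via an n x n system (n = number of training examples)."""
    n = X.shape[0]
    c = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ c  # weight vector w = X^T c


def rls_loo_outputs(X, y, lam):
    """Leave-one-out outputs f^(-i)(x_i) for all i from a single fit."""
    n = X.shape[0]
    G_inv = np.linalg.inv(X @ X.T + lam * np.eye(n))
    c = G_inv @ y
    # Standard identity for ridge-type estimators:
    # y_i - f^(-i)(x_i) = c_i / (G^-1)_ii
    return y - c / np.diag(G_inv)


def loo_error(X, y, lam):
    """Fraction of examples misclassified when each is held out in turn."""
    return float(np.mean(y * rls_loo_outputs(X, y, lam) <= 0))


# Synthetic stand-in for a microarray data set: few samples, many "genes".
# These numbers are illustrative only and are not taken from the paper.
rng = np.random.default_rng(0)
n_samples, n_genes = 20, 200
X = rng.standard_normal((n_samples, n_genes))
y = np.where(X[:, 0] + 0.5 * rng.standard_normal(n_samples) > 0, 1.0, -1.0)

w_primal = rls_primal(X, y, lam=1.0)
w_dual = rls_dual(X, y, lam=1.0)
print("primal and dual agree:", np.allclose(w_primal, w_dual))
print("LOO error:", loo_error(X, y, lam=1.0))
```

With microarray data the number of genes typically far exceeds the number of specimens, so under these assumptions the n x n form is the cheaper one; after aggressive gene selection the d x d form may become smaller.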

CONCLUSION

RLS classifiers are a valuable alternative to SVM classifiers for cancer classification from gene expression data, owing to their simplicity and low computational complexity. Moreover, RLS classifiers show generalization ability comparable to that of SVM classifiers even when the classification of new specimens involves very few gene expression levels.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c630/1866388/d45ebc903e59/1471-2105-6-S4-S2-1.jpg

Similar articles

On the statistical assessment of classifiers using DNA microarray data.

BMC Bioinformatics. 2006 Aug 19;7:387. doi: 10.1186/1471-2105-7-387.

Low rank updated LS-SVM classifiers for fast variable selection.

Neural Netw. 2008 Mar-Apr;21(2-3):437-49. doi: 10.1016/j.neunet.2007.12.053. Epub 2008 Feb 2.

Many accurate small-discriminatory feature subsets exist in microarray transcript data: biomarker discovery.

BMC Bioinformatics. 2005 Apr 13;6:97. doi: 10.1186/1471-2105-6-97.

[Application of support vector machines to classification of blood cells].

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2003 Sep;20(3):484-7.

Bias in error estimation when using cross-validation for model selection.

BMC Bioinformatics. 2006 Feb 23;7:91. doi: 10.1186/1471-2105-7-91.

Effects of replacing the unreliable cDNA microarray measurements on the disease classification based on gene expression profiles and functional modules.

Bioinformatics. 2006 Dec 1;22(23):2883-9. doi: 10.1093/bioinformatics/btl339. Epub 2006 Jun 29.

Improving gene expression cancer molecular pattern discovery using nonnegative principal component analysis.

Genome Inform. 2008;21:200-11.

Selecting subsets of newly extracted features from PCA and PLS in microarray data analysis.

BMC Genomics. 2008 Sep 16;9 Suppl 2(Suppl 2):S24. doi: 10.1186/1471-2164-9-S2-S24.

Gene selection using support vector machines with non-convex penalty.

Bioinformatics. 2006 Jan 1;22(1):88-95. doi: 10.1093/bioinformatics/bti736. Epub 2005 Oct 25.

Cited by

An Integrated Local Classification Model of Predicting Drug-Drug Interactions via Dempster-Shafer Theory of Evidence.

Sci Rep. 2018 Aug 7;8(1):11829. doi: 10.1038/s41598-018-30189-z.

SSCMDA: spy and super cluster strategy for MiRNA-disease association prediction.

Oncotarget. 2017 Dec 1;9(2):1826-1842. doi: 10.18632/oncotarget.22812. eCollection 2018 Jan 5.

Predicting existing targets for new drugs base on strategies for missing interactions.

BMC Bioinformatics. 2016 Aug 31;17 Suppl 8(Suppl 8):282. doi: 10.1186/s12859-016-1118-2.

Biological and functional analysis of statistically significant pathways deregulated in colon cancer by using gene expression profiles.

Int J Biol Sci. 2008;4(6):368-78. doi: 10.7150/ijbs.4.368. Epub 2008 Oct 14.

On the statistical assessment of classifiers using DNA microarray data.

BMC Bioinformatics. 2006 Aug 19;7:387. doi: 10.1186/1471-2105-7-387.
