Improved centroids estimation for the nearest shrunken centroid classifier.

Authors

Wang Sijian, Zhu Ji

Affiliation

Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA.

Publication information

Bioinformatics. 2007 Apr 15;23(8):972-9. doi: 10.1093/bioinformatics/btm046. Epub 2007 Mar 24.

Abstract

MOTIVATION

The nearest shrunken centroid (NSC) method has been successfully applied in many DNA-microarray classification problems. The NSC uses 'shrunken' centroids as prototypes for each class and identifies subsets of genes that best characterize each class. Each sample is then classified to the nearest (shrunken) centroid. The NSC is easy to implement and easy to interpret; however, it has drawbacks.
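The shrink-then-classify procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard NSC recipe, not the authors' code: the function names `fit_nsc` and `predict_nsc`, the choice of the offset `s0` as the median of the per-gene standard deviations, the factor `m_k = sqrt(1/n_k - 1/n)`, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def fit_nsc(X, y, delta):
    """Fit shrunken centroids. X: (n_samples, n_genes); y: class labels;
    delta: soft-threshold amount (hypothetical tuning parameter name)."""
    classes, idx = np.unique(y, return_inverse=True)
    n = X.shape[0]
    K = len(classes)
    counts = np.bincount(idx, minlength=K)

    overall = X.mean(axis=0)                                   # overall centroid
    centroids = np.vstack([X[idx == k].mean(axis=0) for k in range(K)])

    # Pooled within-class standard deviation for each gene.
    resid = X - centroids[idx]
    s = np.sqrt((resid ** 2).sum(axis=0) / (n - K))
    s0 = np.median(s)                  # assumed offset guarding against tiny s

    # m_k * s is the standard error of (class centroid - overall centroid).
    m = np.sqrt(1.0 / counts - 1.0 / n)
    scale = m[:, None] * (s + s0)[None, :]

    # Standardized differences, soft-thresholded toward zero.
    d = (centroids - overall) / scale
    d_shrunk = np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)

    return {
        "classes": classes,
        "shrunken": overall + scale * d_shrunk,                # shrunken centroids
        "d_shrunk": d_shrunk,
        "s": s + s0,
        "priors": counts / n,
    }

def predict_nsc(model, X_new):
    """Assign each row of X_new to the nearest shrunken centroid."""
    z = (X_new[:, None, :] - model["shrunken"][None, :, :]) / model["s"]
    # Standardized squared distance, corrected by the class prior.
    scores = (z ** 2).sum(axis=2) - 2.0 * np.log(model["priors"])
    return model["classes"][np.argmin(scores, axis=1)]

# Synthetic demo: 40 samples, 20 genes, only the first two genes differ by class.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 20))
y = np.repeat([0, 1], 20)
X[y == 1, :2] += 4.0

model = fit_nsc(X, y, delta=3.0)
pred = predict_nsc(model, X)
# Genes whose shrunken difference is nonzero in at least one class.
selected = np.flatnonzero(np.any(model["d_shrunk"] != 0, axis=0))
```

A gene whose thresholded difference is zero in every class contributes the same term to every class score, so it plays no role in the classification; this is the sense in which the method selects a subset of genes.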

RESULTS

We show that the NSC method can be interpreted in the framework of LASSO regression. Based on that interpretation, we consider two new methods for microarray classification, adaptive L∞-norm penalized NSC (ALP-NSC) and adaptive hierarchically penalized NSC (AHP-NSC), which use two different penalty functions and improve over the NSC. Unlike the L1-norm penalty used in LASSO, the penalty terms that we consider make use of the fact that parameters belonging to one gene should be treated as a natural group. Numerical results indicate that the two new methods tend to remove irrelevant genes more effectively and provide better classification results than the L1-norm approach.
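The abstract does not state the objective functions, so the following is only an illustrative sketch in assumed notation (z, μ, λ, and μ_jk for the parameter of gene j in class k), not the authors' formulation. It shows why soft-thresholding corresponds to an L1 penalty, and how a group-wise L∞ penalty differs from it.

```latex
% Scalar L1-penalized least squares has the soft-thresholding solution:
\hat{\mu}
  = \arg\min_{\mu}\ \tfrac{1}{2}(z-\mu)^2 + \lambda\,|\mu|
  = \operatorname{sign}(z)\,\bigl(|z|-\lambda\bigr)_{+}

% L1 penalty: every (gene, class) parameter is penalized separately.
P_{1}(\mu) = \lambda \sum_{j=1}^{p} \sum_{k=1}^{K} |\mu_{jk}|

% Group-wise L-infinity penalty: the K parameters of gene j are penalized together.
P_{\infty}(\mu) = \lambda \sum_{j=1}^{p} \max_{1 \le k \le K} |\mu_{jk}|
```

Because max_k |μ_jk| is zero only when μ_jk is zero for every class k, a penalty of the second kind removes a gene from all classes at once, which is one way of treating the parameters of a gene as a group.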

AVAILABILITY

R code for the ALP-NSC and AHP-NSC algorithms is available from the authors upon request.
