Bidirectional discrimination with application to data visualization.

Authors

Huang Hanwen, Liu Yufeng, Marron J S

Affiliation

Center for Clinical and Translational Sciences, University of Texas Health Science Center at Houston, Houston, Texas 77030, U.S.A.

Publication information

Biometrika. 2012 Dec;99(4):851-864. doi: 10.1093/biomet/ass029. Epub 2012 Jul 24.

Abstract

Linear classifiers are very popular, but can have limitations when classes have distinct subpopulations. General nonlinear kernel classifiers are very flexible, but do not give clear interpretations and may not be efficient in high dimensions. We propose the bidirectional discrimination classification method, which generalizes linear classifiers to two or more hyperplanes. This new family of classification methods gives much of the flexibility of a general nonlinear classifier while maintaining the interpretability, and much of the parsimony, of linear classifiers. They provide a new visualization tool for high-dimensional, low-sample-size data. Although the idea is generally applicable, we focus on the generalization of the support vector machine and distance-weighted discrimination methods. The performance and usefulness of the proposed method are assessed using asymptotics and demonstrated through analysis of simulated and real data. Our method leads to better classification performance in high-dimensional situations where subclusters are present in the data.
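
To make the idea of classifying with two hyperplanes concrete, here is a minimal sketch on toy data where each class has two subclusters. The combination rule (sign of the product of two linear scores), the function name `two_hyperplane_predict`, and the hand-picked directions are assumptions made for illustration; the abstract does not state how the hyperplanes are combined or fitted.

```python
import numpy as np

def two_hyperplane_predict(X, w1, b1, w2, b2):
    """Label points by the sign of the product of two linear scores.

    Illustrative rule only (an assumption, not taken from the abstract).
    """
    f1 = X @ w1 + b1  # signed score relative to the first hyperplane
    f2 = X @ w2 + b2  # signed score relative to the second hyperplane
    return np.where(f1 * f2 > 0, 1, -1)

# XOR-style toy data: each class is made of two separate subclusters.
X = np.array([[2.0, 2.0], [-2.0, -2.0], [2.0, -2.0], [-2.0, 2.0]])
y = np.array([1, 1, -1, -1])

# Hand-picked directions (hypothetical; a real method would estimate them).
w1, b1 = np.array([1.0, 0.0]), 0.0
w2, b2 = np.array([0.0, 1.0]), 0.0

pred = two_hyperplane_predict(X, w1, b1, w2, b2)
print(pred)
```

No single hyperplane separates these four points, yet each of the two directions stays individually interpretable, and projecting the data onto them gives a two-dimensional view of the subclusters.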

