Some considerations of classification for high dimension low-sample size data.

Author information

Department of Statistics, Purdue University, West Lafayette, IN, USA.

Publication information

Stat Methods Med Res. 2013 Oct;22(5):537-50. doi: 10.1177/0962280211428387. Epub 2011 Nov 23.

Abstract

We review in this article several classification methods, especially for high-dimensional and low-sample size data. We discuss several desirable properties for classifiers in such settings, including predictability, consistency, generality, stability, robustness and sparsity. Specifically, a good classifier should have a small prediction error (predictability); converge to the Bayes-rule classifier asymptotically (consistency); be stable when adding/removing an observation (generality); be stable for different data sets of the same kind (stochastic stability); be stable when there are a small number of contaminated observations (robustness); and have a small number of variables in the classifier (interpretability or sparsity). Several simulation examples and real applications are used to illustrate the usefulness of the existing popular classifiers and compare their performance.
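
Two of the properties named above, a small prediction error and stability when an observation is removed, can be measured directly on simulated data. The sketch below is only an illustration under stated assumptions and is not taken from the article: the nearest-centroid classifier, the Gaussian data model, the dimensions, and the helper names (`fit_centroids`, `predict`, `make_data`) are all hypothetical choices made for this example.

```python
import numpy as np

def fit_centroids(X, y):
    """Return the per-class mean vectors for labels 0 and 1."""
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict(centroids, X):
    """Assign each row of X to the class with the nearer centroid."""
    c0, c1 = centroids
    d0 = np.linalg.norm(X - c0, axis=1)
    d1 = np.linalg.norm(X - c1, axis=1)
    return (d1 < d0).astype(int)

def make_data(rng, n_per_class, d, shift):
    """Two Gaussian classes differing by `shift` in the first 10 coordinates.

    This data model is an assumption for illustration only.
    """
    X0 = rng.normal(size=(n_per_class, d))
    X1 = rng.normal(size=(n_per_class, d))
    X1[:, :10] += shift
    X = np.vstack([X0, X1])
    y = np.repeat([0, 1], n_per_class)
    return X, y

rng = np.random.default_rng(0)

# High dimension (d = 500), low sample size (20 training points).
X_train, y_train = make_data(rng, n_per_class=10, d=500, shift=2.0)
X_test, y_test = make_data(rng, n_per_class=200, d=500, shift=2.0)

# Prediction error of the classifier fitted on all training points.
full_model = fit_centroids(X_train, y_train)
full_pred = predict(full_model, X_test)
test_error = float(np.mean(full_pred != y_test))

# Stability under removal of one observation: refit without each
# training point and record the fraction of test predictions that change.
flip_rates = []
for i in range(len(y_train)):
    keep = np.arange(len(y_train)) != i
    loo_model = fit_centroids(X_train[keep], y_train[keep])
    flip_rates.append(float(np.mean(predict(loo_model, X_test) != full_pred)))
max_flip = max(flip_rates)

print(f"test error: {test_error:.3f}")
print(f"largest change in predictions after dropping one point: {max_flip:.3f}")
```

A large value of `max_flip` would indicate that a single observation strongly influences the fitted rule. The other properties listed in the abstract (consistency, robustness, sparsity) would need different experiments and are not covered by this sketch.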
