On dimensionality, sample size, classification error, and complexity of classification algorithm in pattern recognition.

Affiliation

Lietuvos RSR Moksly, Adademiha, Lenino, U.S.S.R.

Publication information

IEEE Trans Pattern Anal Mach Intell. 1980 Mar;2(3):242-52. doi: 10.1109/tpami.1980.4767011.

Abstract

This paper compares four classification algorithms (discriminant functions) for classifying individuals into one of two multivariate populations. The discriminant functions (DFs) compared are derived according to the Bayes rule for normal populations and differ in their assumptions about the structure of the covariance matrices. Analytical formulas for the expected probability of misclassification EP_N are derived. They show that EP_N depends on the structure of the classification algorithm, the asymptotic probability of misclassification P∞, and the ratio of learning sample size N to dimensionality p: N/p for all linear DFs discussed and N²/p for quadratic DFs. Tables of the learning quantity H = EP_N/P∞ as a function of the parameters P∞, N, and p are presented for the four classification algorithms analyzed. They may be used for estimating the necessary learning sample size, determining the optimal number of features, and choosing the type of classification algorithm when the learning sample size is limited.
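The dependence of the learning quantity H on the learning sample size and the dimensionality can be illustrated numerically. The following is a minimal Monte Carlo sketch, not the paper's analytical formulas or tables. It assumes two spherical normal populations with identity covariance and equal priors, a nearest-mean (Euclidean distance) linear rule built from estimated class means, and a learning sample of n vectors per class. The function names and the parameters `delta` (distance between the true means), `trials`, and `seed` are hypothetical and chosen only for this illustration.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def asymptotic_error(delta):
    """Asymptotic error for two N(mu, I) classes whose means are delta apart."""
    return PHI(-delta / 2.0)


def conditional_error(m1, m2, mu1, mu2):
    """Exact error of the nearest-mean rule built from estimated means m1, m2
    when the true classes are N(mu1, I) and N(mu2, I) with equal priors."""
    w = [a - b for a, b in zip(m1, m2)]  # discriminant direction
    c = sum(wi * (a + b) / 2.0 for wi, a, b in zip(w, m1, m2))  # threshold
    norm = math.sqrt(sum(wi * wi for wi in w))
    s1 = (sum(wi * x for wi, x in zip(w, mu1)) - c) / norm
    s2 = (sum(wi * x for wi, x in zip(w, mu2)) - c) / norm
    # Class 1 is misclassified when w.x < c, class 2 when w.x >= c.
    return 0.5 * PHI(-s1) + 0.5 * PHI(s2)


def expected_error(p, n, delta, trials=2000, seed=0):
    """Monte Carlo estimate of the expected error for dimensionality p and
    n learning vectors per class (assumed setup, see the text above)."""
    rng = random.Random(seed)
    mu1 = [delta / 2.0] + [0.0] * (p - 1)
    mu2 = [-delta / 2.0] + [0.0] * (p - 1)
    # The mean of n draws from N(mu, I) is N(mu, I/n), so it is sampled directly.
    scale = 1.0 / math.sqrt(n)
    total = 0.0
    for _ in range(trials):
        m1 = [m + scale * rng.gauss(0.0, 1.0) for m in mu1]
        m2 = [m + scale * rng.gauss(0.0, 1.0) for m in mu2]
        total += conditional_error(m1, m2, mu1, mu2)
    return total / trials


def learning_quantity(p, n, delta, **kwargs):
    """Ratio of the estimated expected error to the asymptotic error."""
    return expected_error(p, n, delta, **kwargs) / asymptotic_error(delta)


if __name__ == "__main__":
    for n in (5, 10, 50, 200):
        print(n, round(learning_quantity(p=10, n=n, delta=2.0), 3))
```

Under these assumptions the ratio is never below 1, because no rule built from estimated means can beat the rule that uses the true means, and it shrinks toward 1 as n grows relative to p. The quadratic DF and the DFs that estimate a covariance matrix are not covered by this sketch.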
