Narasimhamurthy Anand
Department of Computer Science and Engineering, 341 IST Building, Pennsylvania State University, University Park, PA 16802, USA.
IEEE Trans Pattern Anal Mach Intell. 2005 Dec;27(12):1988-95. doi: 10.1109/TPAMI.2005.249.
A number of earlier studies that have attempted a theoretical analysis of majority voting assume independence of the classifiers. We formulate the majority voting problem as an optimization problem with linear constraints. No assumptions on the independence of classifiers are made. For a binary classification problem, given the accuracies of the classifiers in the team, the theoretical upper and lower bounds for performance obtained by combining them through majority voting are shown to be solutions of the corresponding optimization problem. The objective function of the optimization problem is nonlinear in the case of an even number of classifiers when rejection is allowed; in all other cases the objective function is linear, and hence the problem is a linear program (LP). Using this framework, we provide some insights and investigate the relationship between two candidate classifier diversity measures and majority voting performance.
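The formulation described in the abstract can be illustrated with a minimal sketch. The encoding below is an assumption and not taken from the paper: each variable is the probability of one joint correctness pattern (which classifiers are right and which are wrong), the linear constraints require the probabilities to sum to 1 and each classifier's marginal to equal its given accuracy, and the objective is the total probability of patterns in which more than half of the classifiers are correct. The sketch covers only the odd-sized case, where the objective is linear. It is not an LP solver: the hypothetical helpers `is_feasible` and `majority_accuracy` only check the constraints and evaluate the objective at hand-picked joint distributions for three classifiers of accuracy 0.6.

```python
from fractions import Fraction
from itertools import product


def patterns(n):
    """All 2**n correctness patterns; entry i is 1 if classifier i is correct."""
    return list(product((0, 1), repeat=n))


def is_feasible(x, accuracies):
    """Check the linear constraints on a joint distribution x.

    x maps correctness patterns to probabilities (missing patterns count as 0).
    """
    n = len(accuracies)
    if any(s not in set(patterns(n)) for s in x):
        return False
    if any(v < 0 for v in x.values()):
        return False
    if sum(x.values()) != 1:
        return False
    # Marginal constraint: classifier i is correct with probability accuracies[i].
    return all(
        sum(v for s, v in x.items() if s[i] == 1) == accuracies[i]
        for i in range(n)
    )


def majority_accuracy(x):
    """Linear objective (odd n): mass of patterns with more than half correct."""
    return sum(v for s, v in x.items() if 2 * sum(s) > len(s))


def independent(accuracies):
    """Joint distribution under the independence assumption, for comparison."""
    x = {}
    for s in patterns(len(accuracies)):
        prob = Fraction(1)
        for correct, p in zip(s, accuracies):
            prob *= p if correct else 1 - p
        x[s] = prob
    return x


# Three classifiers, each with accuracy 0.6 (exact arithmetic via Fraction).
acc = [Fraction(3, 5)] * 3

# Hand-picked feasible point: correct votes spread so that exactly two agree.
favourable = {
    (1, 1, 0): Fraction(3, 10),
    (1, 0, 1): Fraction(3, 10),
    (0, 1, 1): Fraction(3, 10),
    (0, 0, 0): Fraction(1, 10),
}

# Hand-picked feasible point: correct votes wasted on unanimous or lone votes.
unfavourable = {
    (1, 1, 1): Fraction(4, 10),
    (1, 0, 0): Fraction(2, 10),
    (0, 1, 0): Fraction(2, 10),
    (0, 0, 1): Fraction(2, 10),
}

for name, x in [("independent", independent(acc)),
                ("favourable", favourable),
                ("unfavourable", unfavourable)]:
    print(name, is_feasible(x, acc), majority_accuracy(x))
```

All three distributions satisfy the same constraints, yet their majority-vote accuracies differ (81/125, 9/10 and 2/5 respectively), which is why the independence assumption alone does not determine performance. In this small case a counting argument on the expected number of correct votes shows that no feasible point exceeds 9/10 or falls below 2/5; for general accuracies the bounds would come from handing the same variables, constraints and objective to an LP solver.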