Accuracy/diversity and ensemble MLP classifier design.

Author information

Windeatt Terry

Affiliation

Center for Vision, Speech, and Signal Processing (CVSSP), School of Electronics and Physical Sciences, University of Surrey, Guildford, Surrey GU2 7XH, UK.

Publication information

IEEE Trans Neural Netw. 2006 Sep;17(5):1194-211. doi: 10.1109/TNN.2006.875979.

DOI:10.1109/TNN.2006.875979

PMID:17001981

Abstract

The difficulties of tuning the parameters of multilayer perceptron (MLP) classifiers are well known. In this paper, a measure is described that is capable of predicting the number of classifier training epochs needed to achieve optimal performance in an ensemble of MLP classifiers. The measure is computed between pairs of patterns on the training data and is based on a spectral representation of a Boolean function. This representation characterizes the mapping from classifier decisions to target label and allows accuracy and diversity to be incorporated within a single measure. Results on many benchmark problems, including the Olivetti Research Laboratory (ORL) face database, demonstrate that the measure is well correlated with base-classifier test error and may be used to predict the optimal number of training epochs. While correlation with ensemble test error is not quite as strong, it is shown in this paper that the measure may be used to predict the number of epochs for optimal ensemble performance. Although the technique is only applicable to two-class problems, it is extended here to multiclass problems through output coding. For the output-coding technique, a random code matrix is shown to give better performance than a one-per-class code, even when the base classifier is well tuned.
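The output-coding idea mentioned above can be illustrated with a minimal sketch. It is a generic illustration, not the paper's implementation: the abstract does not specify how the random code matrix is built or how outputs are decoded, so the helper names (`one_per_class_code`, `random_code`, `decode`), the uniform random bits, and the L1-distance decoding rule are all assumptions. Each column of a code matrix defines one two-class problem, and a pattern is assigned to the class whose code word lies closest to the vector of binary classifier outputs.

```python
import random


def one_per_class_code(n_classes):
    """One column per class: row i has a 1 only in column i."""
    return [[1 if i == j else 0 for j in range(n_classes)]
            for i in range(n_classes)]


def random_code(n_classes, n_columns, seed=0):
    """Random binary code matrix with distinct rows.

    Hypothetical helper: uniform random bits are an assumption,
    not the construction used in the paper.
    """
    if 2 ** n_columns < n_classes:
        raise ValueError("too few columns for distinct code words")
    rng = random.Random(seed)
    while True:
        matrix = [[rng.randint(0, 1) for _ in range(n_columns)]
                  for _ in range(n_classes)]
        # Redraw until every class has its own code word.
        if len({tuple(row) for row in matrix}) == n_classes:
            return matrix


def decode(outputs, code_matrix):
    """Return the index of the code word closest (L1 distance) to the outputs.

    `outputs` holds one value in [0, 1] per column, i.e. one per
    two-class base classifier. Ties go to the lowest class index.
    """
    distances = [sum(abs(o - b) for o, b in zip(outputs, row))
                 for row in code_matrix]
    return distances.index(min(distances))
```

With a one-per-class code for four classes, `decode([0.9, 0.2, 0.1, 0.3], one_per_class_code(4))` selects class 0, since its code word `[1, 0, 0, 0]` is nearest to the outputs. A random code with more columns than classes spreads each class decision over several two-class problems.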