Subasi M, Subasi E, Anthony M, Hammer P L
RUTCOR, Rutgers University, 640 Bartholomew Road, Piscataway, NJ 08854-8003, USA.
Discrete Appl Math. 2009 Mar 6;157(5):1104-1112. doi: 10.1016/j.dam.2008.04.007.
This paper concerns classification by Boolean functions. We investigate the classification accuracy obtained by standard classification techniques on unseen points (elements of the domain, {0, 1}^n, for some n) that are similar, in particular senses, to the points that have been observed as training observations. Explicitly, we use a new measure of how similar a point x in {0, 1}^n is to a set of such points to restrict the domain of points on which we offer a classification. For points sufficiently dissimilar, no classification is given. We report on experimental results which indicate that the classification accuracies obtained on the resulting restricted domains are better than those obtained without restriction. These experiments involve a number of standard data-sets and classification techniques. We also compare the classification accuracies with those obtained by restricting the domain on which classification is given by using the Hamming distance.
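The Hamming-distance restriction used as the comparison baseline can be illustrated with a minimal sketch. It is not the authors' implementation, and it does not reproduce the paper's new similarity measure, which the abstract does not define. The function names, the threshold `max_distance`, the choice of the minimum distance to any training point as the distance to the set, and the toy majority-vote classifier are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, ...]  # a point of {0, 1}^n


def hamming(a: Point, b: Point) -> int:
    """Number of coordinates in which two binary points differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_to_set(x: Point, train: Sequence[Point]) -> int:
    """Assumed set distance: Hamming distance to the nearest training point."""
    return min(hamming(x, t) for t in train)


def classify_or_abstain(
    x: Point,
    train: Sequence[Point],
    classifier: Callable[[Point], int],
    max_distance: int,
) -> Optional[int]:
    """Return a label only if x is close enough to the training set.

    Points farther than max_distance (a hypothetical threshold) from every
    training point get no classification, represented here by None.
    """
    if distance_to_set(x, train) > max_distance:
        return None
    return classifier(x)


# Toy data and a stand-in classifier (both hypothetical).
train = [(0, 0, 1, 1), (0, 1, 1, 1), (1, 1, 0, 0)]


def majority(x: Point) -> int:
    """Label 1 when at least half of the coordinates are 1."""
    return int(2 * sum(x) >= len(x))


near = classify_or_abstain((0, 0, 1, 0), train, majority, max_distance=1)
far = classify_or_abstain((1, 0, 0, 1), train, majority, max_distance=1)
```

In this sketch, `near` receives a label because the query point is within distance 1 of a training point, while `far` is `None` because every training point is at distance 2 or more. Accuracy on the restricted domain would then be computed only over the points that received a label.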