Department of Systems and Computer Networks, Wroclaw University of Science and Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland.
Int J Neural Syst. 2018 Nov;28(9):1750062. doi: 10.1142/S0129065717500629. Epub 2017 Dec 17.
This paper addresses multi-label (ML) classification under the label-pairwise (LPW) scheme and proposes a method for correcting the binary classifiers that constitute the LPW ensemble. The correction is based on a probabilistic (randomized) model of a classifier that assesses the local class-specific probabilities of correct classification and misclassification. These probabilities are determined using the original concepts of a randomized reference classifier (RRC) and a local soft confusion matrix. Two special cases, dealing with imbalanced labels and double-labeled instances, are also considered. The proposed methods were evaluated on 29 benchmark datasets. To assess the efficiency of the introduced models and the proposed correction scheme, they were compared against the original binary classifiers working in the LPW ensemble. The comparison used four ML evaluation measures: macro- and micro-averaged [Formula: see text] loss, zero-one loss and Hamming loss. Relations between classification quality and characteristics of ML datasets, such as average imbalance ratio or label density, were also investigated. The experimental study shows that the correction approaches significantly outperform the reference method in terms of zero-one loss and Hamming loss.
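For readers unfamiliar with the evaluation measures named above, the following is a minimal sketch of Hamming loss, zero-one (subset) loss and label density, using their common textbook definitions. It is not taken from the paper, and the exact formulations used there may differ. The function names, the set-based label representation and the toy data are assumptions made for illustration only.

```python
from typing import List, Set

# Each instance is represented by the set of labels relevant to it
# (an illustrative choice, not the paper's representation).
LabelSet = Set[str]


def _check(y_true: List[LabelSet], y_pred: List[LabelSet]) -> None:
    """Reject empty or mismatched inputs before computing a measure."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")


def hamming_loss(y_true: List[LabelSet], y_pred: List[LabelSet], n_labels: int) -> float:
    """Fraction of (instance, label) pairs where prediction and truth disagree."""
    _check(y_true, y_pred)
    # The symmetric difference counts labels that are wrongly added or wrongly missed.
    errors = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return errors / (len(y_true) * n_labels)


def zero_one_loss(y_true: List[LabelSet], y_pred: List[LabelSet]) -> float:
    """Fraction of instances whose predicted label set is not exactly correct."""
    _check(y_true, y_pred)
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def label_density(y_true: List[LabelSet], n_labels: int) -> float:
    """Average number of relevant labels per instance, divided by the label count."""
    if not y_true:
        raise ValueError("y_true must be non-empty")
    return sum(len(t) for t in y_true) / (len(y_true) * n_labels)


# Toy data with three possible labels: "a", "b", "c" (hypothetical).
y_true = [{"a", "b"}, {"c"}, {"a"}, set()]
y_pred = [{"a"}, {"c"}, {"b"}, set()]

print(hamming_loss(y_true, y_pred, n_labels=3))  # 3 label errors / 12 pairs = 0.25
print(zero_one_loss(y_true, y_pred))             # 2 wrong instances / 4 = 0.5
print(label_density(y_true, n_labels=3))         # 4 relevant labels / 12 pairs
```

Zero-one loss penalizes any deviation from the exact label set, while Hamming loss counts errors per label, so the two measures can rank the same classifiers differently.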