Jiao Lianmeng, Geng Xiaojiao, Pan Quan
School of Automation, Northwestern Polytechnical University, Xi'an 710072, China.
Entropy (Basel). 2019 Apr 28;21(5):443. doi: 10.3390/e21050443.
The belief rule-based classification system (BRBCS) is a promising technique for handling different types of uncertainty in complex classification problems; it introduces belief function theory into the classical fuzzy rule-based classification system. However, in the BRBCS, large numbers of instances and features generally produce a large belief rule base (BRB), which degrades the interpretability of the classification model on big data sets. In this paper, a BRB learning method based on the evidential C-means clustering (ECM) algorithm is proposed to efficiently design a compact belief rule-based classification system (CBRBCS). First, a supervised version of the ECM algorithm is designed by means of weighted product-space clustering to partition the training set, with the goals of obtaining both good inter-cluster separability and intra-cluster purity. Then, a systematic method is developed to construct belief rules from the resulting credal partitions. Finally, an optimization procedure based on evidential partition entropy is designed to obtain a compact BRB with a better trade-off between accuracy and interpretability. The key benefit of the proposed CBRBCS is that it provides a more interpretable classification model while maintaining comparable accuracy. Experiments on synthetic and real data sets were conducted to evaluate the classification accuracy and interpretability of the proposed method.
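To make the notion of a credal partition more concrete, the following is a minimal sketch of the mass-assignment step of the standard, unsupervised ECM algorithm for fixed cluster centers. It is not the supervised, weighted product-space variant proposed in the paper, and it omits the center update and the rule construction. The function names (`focal_sets`, `credal_masses`) and the default values of `alpha`, `beta`, and `delta` are assumptions chosen for illustration only.

```python
from itertools import combinations

import numpy as np


def focal_sets(n_clusters):
    """Return all non-empty subsets of {0, ..., n_clusters - 1} as tuples."""
    return [
        subset
        for size in range(1, n_clusters + 1)
        for subset in combinations(range(n_clusters), size)
    ]


def credal_masses(X, centers, alpha=1.0, beta=2.0, delta=10.0):
    """Assign belief masses to sets of clusters (standard ECM, fixed centers).

    Hypothetical helper for illustration; parameter defaults are assumptions.
    Returns (subsets, masses, empty_mass) where masses[i, j] is the mass that
    instance i gives to subsets[j], and empty_mass[i] is the mass on the
    empty set (interpreted as "outlier").
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    subsets = focal_sets(len(centers))

    # Each set of clusters is represented by the mean of its member centers.
    barycenters = np.array([centers[list(s)].mean(axis=0) for s in subsets])
    cardinality = np.array([len(s) for s in subsets], dtype=float)

    # Squared Euclidean distances, shape (n_instances, n_subsets).
    d2 = ((X[:, None, :] - barycenters[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # avoid division by zero at a barycenter

    exponent = -1.0 / (beta - 1.0)
    # |A|^(-alpha/(beta-1)) * d^(-2/(beta-1)): larger sets are penalized.
    weights = cardinality ** (alpha * exponent) * d2 ** exponent
    empty_weight = (delta ** 2) ** exponent  # delta^(-2/(beta-1))

    total = weights.sum(axis=1) + empty_weight
    return subsets, weights / total[:, None], empty_weight / total
```

With two centers, an instance lying midway between them receives most of its mass on the set containing both clusters rather than on either singleton, which is the kind of imprecision a credal partition is meant to capture.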