Gao Kangkai, Wang Yong, Ma Liyao
Department of Automation, University of Science and Technology of China, Hefei 230027, China.
School of Electrical Engineering, University of Jinan, Jinan 250022, China.
Entropy (Basel). 2022 Apr 26;24(5):605. doi: 10.3390/e24050605.
Decision trees are well-known machine learning methods that are widely applied in classification and recognition. In this paper, a new decision tree method based on belief entropy is proposed, in which the uncertainty of labels is handled by belief functions, and the method is then extended to a random forest. By using a Gaussian mixture model, the tree method can deal with continuous attribute values directly, without prior discretization. Specifically, the tree method adopts belief entropy, an uncertainty measure defined on the basic belief assignment, as a new attribute selection tool. To improve classification performance, we construct a random forest from these basic trees and discuss different strategies for combining their predictions. Numerical experiments on UCI machine learning data sets indicate that the proposed method achieves good classification accuracy in different situations, especially on data with high uncertainty.
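To make the notion of an entropy defined on a basic belief assignment (BBA) concrete, the following is a minimal sketch. It assumes that the belief entropy in question takes the form commonly known as Deng entropy, E(m) = -Σ m(A) log2( m(A) / (2^|A| - 1) ), summed over the focal sets A; the abstract does not state the formula, so this is an assumption. The function name `belief_entropy` and the dictionary representation of a BBA are hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def belief_entropy(bba):
    """Belief (Deng-style) entropy of a basic belief assignment.

    `bba` maps each focal set (a frozenset of class labels) to its mass.
    Assumed formula: E(m) = -sum_A m(A) * log2(m(A) / (2**|A| - 1)).
    """
    total = sum(bba.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("masses must sum to 1")

    entropy = 0.0
    for focal_set, mass in bba.items():
        if mass <= 0:
            continue  # zero-mass sets contribute nothing
        if len(focal_set) == 0:
            raise ValueError("the empty set must not carry mass")
        # Larger focal sets are penalised through the 2**|A| - 1 term.
        entropy -= mass * math.log2(mass / (2 ** len(focal_set) - 1))
    return entropy


# Precise (Bayesian) labels: all mass on singletons.
precise = {frozenset({"a"}): 0.5, frozenset({"b"}): 0.5}

# Fully uncertain label: all mass on the whole frame {a, b}.
vacuous = {frozenset({"a", "b"}): 1.0}

print(belief_entropy(precise))
print(belief_entropy(vacuous))
```

Under this assumed definition, a BBA with mass only on singletons reduces to Shannon entropy, while mass placed on larger sets increases the value, which is why such a measure can rank candidate splits when labels are imprecise.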