

Attribute Selection Based on Constraint Gain and Depth Optimal for a Decision Tree.

Author Information

Sun Huaining, Hu Xuegang, Zhang Yuhong

Affiliations

School of Computer Science, Huainan Normal University, Huainan 232038, China.

School of Computer and Information, Hefei University of Technology, Hefei 230009, China.

Publication Information

Entropy (Basel). 2019 Feb 19;21(2):198. doi: 10.3390/e21020198.

Abstract

Uncertainty evaluation based on statistical probabilistic information entropy is a mechanism commonly used to construct heuristic methods for decision tree learning. The entropy kernel potentially links its deviation to decision tree classification performance. This paper presents a decision tree learning algorithm based on constraint gain and depth-induction optimization. First, the single-value and multi-value event uncertainty distributions of information entropy are calculated and analyzed, yielding an enhanced property of the single-value event entropy kernel, the peaks of the multi-value event entropy, and a reciprocal relationship between peak location and the number of possible events. Second, an estimation method for information entropy is proposed in which the entropy kernel is replaced with a peak-shift sine function, establishing a constraint-gain decision tree (CGDT) learning algorithm. Finally, by combining branch-convergence and fan-out indices under the inductive depth of a decision tree, a constraint-gain and depth-induction improved decision tree (CGDIDT) learning algorithm is built. Results show the benefits of the CGDT and CGDIDT algorithms.
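The core idea summarized above — scoring candidate split attributes by the reduction in an entropy-like uncertainty measure, with the entropy kernel swapped for a sine-based function — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the exact peak-shift sine kernel is not reproduced here, so `sine_uncertainty` uses a plain sin(πp) stand-in, and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a label list, using the standard -p*log2(p) kernel."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def sine_uncertainty(labels):
    """Hypothetical sine-kernel uncertainty: each probability p contributes
    p*sin(pi*p) instead of -p*log2(p). Like entropy, it is zero for a pure
    class distribution; this is only a stand-in for the paper's kernel."""
    n = len(labels)
    return sum((c / n) * math.sin(math.pi * c / n) for c in Counter(labels).values())


def gain(rows, labels, attr_index, measure=entropy):
    """Uncertainty reduction from splitting on attribute attr_index.
    Pass measure=sine_uncertainty to score splits with the sine kernel."""
    n = len(labels)
    splits = {}
    for row, y in zip(rows, labels):
        splits.setdefault(row[attr_index], []).append(y)
    return measure(labels) - sum(
        len(ys) / n * measure(ys) for ys in splits.values()
    )
```

For example, on four instances with attributes (outlook, temperature) and labels no/no/yes/yes, an attribute that separates the classes perfectly scores a gain of 1.0 bit with the entropy measure, while a non-informative attribute scores 0; the sine-based measure ranks the two splits the same way.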


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ee1/7514679/85793d7f1f2e/entropy-21-00198-g001.jpg

Similar Articles

2
Entropy based C4.5-SHO algorithm with information gain optimization in data mining.
PeerJ Comput Sci. 2021 Apr 7;7:e424. doi: 10.7717/peerj-cs.424. eCollection 2021.
6
An Attribute Reduction Method Using Neighborhood Entropy Measures in Neighborhood Rough Sets.
Entropy (Basel). 2019 Feb 7;21(2):155. doi: 10.3390/e21020155.
8
Entropy-based kernel mixture modeling for topographic map formation.
IEEE Trans Neural Netw. 2004 Jul;15(4):850-8. doi: 10.1109/TNN.2004.828763.
10
Entropy-constrained halftoning using multipath tree coding.
IEEE Trans Image Process. 1997;6(11):1567-79. doi: 10.1109/83.641416.

