Tetteh Evans Teiko, Zielosko Beata
Institute of Computer Science, University of Silesia in Katowice, Będzińska 39, 41-200 Sosnowiec, Poland.
Entropy (Basel). 2025 Jan 4;27(1):35. doi: 10.3390/e27010035.
This study introduces a greedy algorithm for deriving decision rules from decision tree ensembles, with the aim of improving interpretability and generalization in distributed data environments. Decision rules are transparent and therefore offer an accessible way to extract knowledge from data and to support decision-making across diverse fields. Traditional decision tree algorithms, such as CART and ID3, are used to induce decision trees from bootstrapped datasets, which represent distributed data sources. A greedy algorithm is then applied to derive decision rules that are true across multiple decision trees. The experiments consider both the knowledge representation and the knowledge discovery perspectives. They show that as the value of α (0 ≤ α < 1) increases, shorter rules are obtained, and the classification accuracy of rule-based models can also improve.
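The abstract does not spell out the greedy procedure or the exact role of α, so the following is only an illustrative sketch, not the authors' algorithm. It assumes a common formulation of approximate decision rules: for a chosen row of a decision table, conditions are added greedily until at most a fraction α of the rows with a different decision remain unseparated. It works on a single table rather than on a tree ensemble, and the function name `greedy_alpha_rule` and the toy data in `TABLE` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Row = Tuple[Sequence[str], str]  # (attribute values, decision)

# Hypothetical toy decision table: (outlook, temperature, humidity) -> decision.
TABLE: List[Row] = [
    (("sunny", "hot", "high"), "no"),
    (("sunny", "mild", "normal"), "yes"),
    (("rain", "hot", "high"), "yes"),
    (("rain", "mild", "high"), "yes"),
    (("sunny", "hot", "normal"), "yes"),
]


def greedy_alpha_rule(table: List[Row], target: int, alpha: float) -> Dict[int, str]:
    """Greedily build conditions {attribute index: value} for row `target`.

    Assumption: alpha is the fraction of differently labeled rows that the
    rule is allowed to leave unseparated from the target row.
    """
    if not 0 <= alpha < 1:
        raise ValueError("alpha must satisfy 0 <= alpha < 1")
    values, decision = table[target]
    # Rows with a different decision must be separated from the target row.
    remaining = [vals for vals, dec in table if dec != decision]
    allowed = alpha * len(remaining)  # number of rows that may stay unseparated
    conditions: Dict[int, str] = {}
    while len(remaining) > allowed:
        candidates = [i for i in range(len(values)) if i not in conditions]
        if not candidates:
            break
        # Pick the attribute whose condition removes the most remaining rows.
        best = max(candidates, key=lambda i: sum(v[i] != values[i] for v in remaining))
        if all(v[best] == values[best] for v in remaining):
            break  # nothing separates the rest (inconsistent table)
        conditions[best] = values[best]
        # Keep only the rows that still match every condition chosen so far.
        remaining = [v for v in remaining if v[best] == values[best]]
    return conditions


if __name__ == "__main__":
    for alpha in (0.0, 0.5):
        print(alpha, greedy_alpha_rule(TABLE, target=0, alpha=alpha))
```

On this toy table, the rule for the first row needs two conditions at α = 0 and only one at α = 0.5, which mirrors the reported trend that larger α yields shorter rules. Whether accuracy improves as well depends on the data and cannot be inferred from this sketch.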