Evolutionary Algorithm for Improving Decision Tree with Global Discretization in Manufacturing.

Affiliation

Department of Industrial and Systems Engineering, Dongguk University, Seoul 04620, Korea.

Publication information

Sensors (Basel). 2021 Apr 18;21(8):2849. doi: 10.3390/s21082849.

DOI:10.3390/s21082849

PMID:33919558

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8074051/

Abstract

Recent advances in the industrial Internet of Things (IoT) in manufacturing have produced vast amounts of sensor data, creating a need to leverage such big data for fault detection. In particular, interpretable machine learning techniques, such as tree-based algorithms, have drawn attention because reliable manufacturing systems require that the root causes of faults be identified. However, although decision trees are highly interpretable, tree-based models trade accuracy against interpretability. To improve the tree's performance while maintaining its interpretability, an evolutionary algorithm for discretization of multiple attributes, called Decision tree Improved by Multiple sPLits with Evolutionary algorithm for Discretization (DIMPLED), is proposed. Experimental results on two real-world sensor datasets showed that the decision tree improved by DIMPLED outperformed the single-decision-tree models (C4.5 and CART) that are widely used in practice and was competitive with ensemble methods, which combine multiple decision trees. Although the ensemble methods can perform slightly better, the proposed DIMPLED has a more interpretable structure while maintaining an appropriate level of performance.
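The abstract does not spell out how DIMPLED works, so the following is only a rough sketch of the general idea it names: an evolutionary search over cut points that discretizes several numeric attributes at once. It is not the authors' algorithm. Every name here (`discretize`, `fitness`, `mutate`, `evolve`, `make_toy_data`) is hypothetical, the toy data are synthetic, and the fitness function scores a simple majority-class table over the joint bins as a stand-in for a decision tree grown on the discretized attributes.

```python
import random
from bisect import bisect_right
from collections import Counter, defaultdict

def discretize(row, cuts):
    """Map each numeric attribute to a bin index using that attribute's sorted cut points."""
    return tuple(bisect_right(c, x) for x, c in zip(row, cuts))

def fitness(cuts, train, valid):
    """Validation accuracy of a majority-class table over joint bins (stand-in for a tree)."""
    cells = defaultdict(Counter)
    for row, label in train:
        cells[discretize(row, cuts)][label] += 1
    # Fall back to the overall majority class for bins never seen in training.
    default = Counter(label for _, label in train).most_common(1)[0][0]
    hits = 0
    for row, label in valid:
        votes = cells.get(discretize(row, cuts))
        pred = votes.most_common(1)[0][0] if votes else default
        hits += pred == label
    return hits / len(valid)

def mutate(cuts, bounds, rng, scale=0.1):
    """Jitter every cut point with Gaussian noise, clamp to the attribute range, re-sort."""
    child = []
    for c, (lo, hi) in zip(cuts, bounds):
        moved = [min(hi, max(lo, v + rng.gauss(0, scale * (hi - lo)))) for v in c]
        child.append(sorted(moved))
    return child

def evolve(train, valid, n_cuts=2, pop_size=20, generations=30, seed=0):
    """Elitist evolutionary search; one individual holds cut points for all attributes."""
    rng = random.Random(seed)
    n_attr = len(train[0][0])
    bounds = [(min(r[j] for r, _ in train), max(r[j] for r, _ in train))
              for j in range(n_attr)]
    pop = [[sorted(rng.uniform(lo, hi) for _ in range(n_cuts)) for lo, hi in bounds]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, train, valid), reverse=True)
        elite = pop[: max(1, pop_size // 2)]  # keep the better half unchanged
        pop = elite + [mutate(rng.choice(elite), bounds, rng)
                       for _ in range(pop_size - len(elite))]
    return max(pop, key=lambda ind: fitness(ind, train, valid))

def make_toy_data(n, rng):
    """Synthetic readings: label 1 (fault) when attribute 0 leaves the 0.3-0.7 band."""
    data = []
    for _ in range(n):
        row = (rng.random(), rng.random())
        data.append((row, int(not 0.3 < row[0] < 0.7)))
    return data

rng = random.Random(42)
train, valid = make_toy_data(200, rng), make_toy_data(100, rng)
best = evolve(train, valid)
print("cut points per attribute:", best)
print("validation accuracy:", fitness(best, train, valid))
```

Each individual encodes the cut points for all attributes together, which is what makes the discretization global rather than chosen node by node. In a fuller version the fitness would presumably come from training an actual decision tree on the binned data, and crossover could be added alongside mutation.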
