Setiono R, Liu H
Sch. of Comput., Nat. Univ. of Singapore.
IEEE Trans Syst Man Cybern B Cybern. 1999;29(3):440-4. doi: 10.1109/3477.764880.
Neural networks and decision tree methods are two common approaches to pattern classification. While neural networks can achieve high predictive accuracy, the decision boundaries they form are highly nonlinear and generally difficult to comprehend. Decision trees, on the other hand, can be readily translated into a set of rules. In this paper, we present a novel algorithm for generating oblique decision trees that capitalizes on the strengths of both approaches. Oblique decision trees classify patterns by testing linear combinations of the input attributes. As a result, an oblique decision tree is usually much smaller than the univariate tree generated for the same domain. Our algorithm consists of two components: connectionist and symbolic. A three-layer feedforward neural network is constructed and pruned, and a decision tree is then built from the hidden-unit activation values of the pruned network. An oblique decision tree is obtained by expressing the activation values in terms of the original input attributes. We test our algorithm on a wide range of problems. The oblique decision trees generated by the algorithm preserve the high accuracy of the neural networks while keeping the explicitness of decision trees. Moreover, they outperform univariate decision trees generated by the symbolic approach and oblique decision trees built by other approaches in both accuracy and tree size.
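The abstract's two-component pipeline (train and prune a network, build a tree on hidden-unit activations, then rewrite each split over the original attributes) can be illustrated with the rough sketch below. This is not the authors' implementation: it assumes scikit-learn's MLPClassifier and DecisionTreeClassifier as stand-ins, approximates network pruning by a small hidden layer plus L2 regularization, and the 0.1 weight cutoff in the printout is purely illustrative.

```python
# Minimal sketch of the connectionist-to-oblique-tree idea (assumptions noted above).
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize inputs

# 1. Connectionist component: a three-layer (single hidden layer) network.
#    A small hidden layer + L2 penalty stands in for the paper's pruning step.
mlp = MLPClassifier(hidden_layer_sizes=(4,), activation="logistic",
                    alpha=1e-2, max_iter=2000, random_state=0).fit(X, y)

# 2. Hidden-unit activation values for every training pattern.
W, b = mlp.coefs_[0], mlp.intercepts_[0]          # W: (n_features, n_hidden)
H = 1.0 / (1.0 + np.exp(-(X @ W + b)))            # logistic activations in (0, 1)

# 3. Symbolic component: a univariate decision tree over the activations.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(H, y)

# 4. Re-express each split "activation_j <= t" over the original inputs.
#    Since activation_j = sigmoid(w_j . x + b_j) is monotone, the test is
#    equivalent to the oblique split  w_j . x <= logit(t) - b_j.
t_ = tree.tree_
for node in range(t_.node_count):
    j = t_.feature[node]
    if j < 0:                                     # leaf node, no split
        continue
    thr = np.clip(t_.threshold[node], 1e-6, 1 - 1e-6)
    offset = np.log(thr / (1.0 - thr)) - b[j]     # logit(t) - b_j
    # Only large weights are printed, to keep the rule readable (illustrative cutoff).
    terms = " + ".join(f"{w:+.2f}*x{i}" for i, w in enumerate(W[:, j]) if abs(w) > 0.1)
    print(f"node {node}: {terms} <= {offset:.2f}")

print("train accuracy (tree on activations):", tree.score(H, y))
```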