Department of Biostatistics, Epidemiology and Informatics, Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
Bioinformatics. 2021 Apr 19;37(2):282-284. doi: 10.1093/bioinformatics/btaa662.
treeheatr is an R package for creating interpretable decision tree visualizations in which the data are represented as a heatmap at the tree's leaf nodes. Presenting the tree structure together with an overview of the data illustrates how the tree nodes split up the feature space and how well the tree model performs. The visualization can also be examined in depth to uncover the correlation structure in the data and the importance of each feature in predicting the outcome. Implemented in an easily installed package with a detailed vignette, treeheatr can be a useful teaching tool to enhance students' understanding of a simple decision tree model before they dive into more complex tree-based machine learning methods.
The treeheatr package is freely available under the permissive MIT license at https://trang1618.github.io/treeheatr and https://cran.r-project.org/package=treeheatr. It comes with a detailed vignette that is automatically built with GitHub Actions continuous integration.
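As a rough illustration of the workflow described above, the sketch below installs the package from CRAN and draws a decision tree with a heatmap at its leaf nodes. It assumes the `heat_tree()` function, its `target_var` argument, and the bundled `penguins` dataset as they appear in the package's README; argument names and defaults may differ between package versions, so the vignette remains the authoritative reference.

```r
# One-time installation from CRAN (uncomment to run).
# install.packages("treeheatr")

library(treeheatr)

# Assumption: `penguins` is an example dataset shipped with treeheatr,
# and `species` is the outcome column the tree should predict.
p <- heat_tree(penguins, target_var = "species")

# Printing the returned object draws the tree with the heatmap below it.
p
```

The data are passed positionally here because the name of the first argument is an assumption that may not hold across releases.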