
Fast Sparse Decision Tree Optimization via Reference Ensembles

Authors

Hayden McTavish, Chudi Zhong, Reto Achermann, Ilias Karimalis, Jacques Chen, Cynthia Rudin, Margo Seltzer

Affiliations

University of British Columbia.

University of California, San Diego.

Publication Information

Proc AAAI Conf Artif Intell. 2022;36(9):9604-9613. doi: 10.1609/aaai.v36i9.21194. Epub 2022 Jun 28.

Abstract

Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960s, breakthroughs have been made on the problem only within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. The guesses come from knowledge gleaned from black box models. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models.
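To make the guessing strategies more concrete, the sketch below illustrates the threshold-guessing idea described in the abstract: split thresholds are harvested from a gradient-boosted reference ensemble and used to binarize continuous features, so that a branch-and-bound solver such as GOSDT only has to search over those candidate splits. This is a minimal illustration assuming scikit-learn's GradientBoostingClassifier as the reference model and binary classification; the helper names guess_thresholds and binarize are hypothetical and not taken from the authors' released code.

```python
# Minimal sketch (not the authors' implementation): harvest split thresholds
# from a black-box reference ensemble and binarize continuous features with
# them, shrinking the search space for an optimal branch-and-bound solver.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def guess_thresholds(X, y, n_estimators=40, max_depth=1):
    """Collect the split thresholds actually used by a boosted reference ensemble."""
    ref = GradientBoostingClassifier(n_estimators=n_estimators, max_depth=max_depth)
    ref.fit(X, y)
    thresholds = {j: set() for j in range(X.shape[1])}
    for stage in ref.estimators_:            # one regression tree per boosting stage (binary task)
        tree = stage[0].tree_
        for feat, thr in zip(tree.feature, tree.threshold):
            if feat >= 0:                    # negative feature index marks a leaf node
                thresholds[int(feat)].add(float(thr))
    return {j: sorted(ts) for j, ts in thresholds.items() if ts}

def binarize(X, thresholds):
    """Expand each continuous feature into binary 'x_j <= t' columns, one per guessed threshold."""
    cols = [(X[:, j] <= t).astype(int)
            for j, ts in sorted(thresholds.items()) for t in ts]
    return np.column_stack(cols)

# Usage (X, y are a continuous-feature dataset):
#   Xb = binarize(X, guess_thresholds(X, y))
# Xb would then be handed to an optimal decision tree solver such as GOSDT,
# which now searches over far fewer candidate splits than all feature midpoints.
```

The depth and error lower-bound guesses mentioned in the abstract would be layered on top of this binarized representation; they are not shown in this sketch.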


Similar Articles

1. Fast Sparse Decision Tree Optimization via Reference Ensembles. Proc AAAI Conf Artif Intell. 2022;36(9):9604-9613. doi: 10.1609/aaai.v36i9.21194. Epub 2022 Jun 28.
5. Optimal Sparse Regression Trees. Proc AAAI Conf Artif Intell. 2023 Jun;37(9):11270-11279. doi: 10.1609/aaai.v37i9.26334.

Cited By

4. Optimal Sparse Survival Trees. Proc Mach Learn Res. 2024 May;238:352-360.
6. Optimal Sparse Regression Trees. Proc AAAI Conf Artif Intell. 2023 Jun;37(9):11270-11279. doi: 10.1609/aaai.v37i9.26334.
8. Decision trees: from efficient prediction to responsible AI. Front Artif Intell. 2023 Jul 26;6:1124553. doi: 10.3389/frai.2023.1124553. eCollection 2023.

