Predicting Machine Learning Pipeline Runtimes in the Context of Automated Machine Learning.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2021 Sep;43(9):3055-3066. doi: 10.1109/TPAMI.2021.3056950. Epub 2021 Aug 4.

Abstract

Automated machine learning (AutoML) seeks to automatically find so-called machine learning pipelines that maximize the prediction performance when used to train a model on a given dataset. One of the main and still open challenges in AutoML is the effective use of computational resources: an AutoML process involves the evaluation of many candidate pipelines, which are costly but often ineffective because they are canceled due to a timeout. In this paper, we present an approach to predict the runtime of two-step machine learning pipelines with up to one pre-processor, which can be used to anticipate whether or not a pipeline will time out. Separate runtime models are trained offline for each algorithm that may be used in a pipeline, and an overall prediction is derived from these models. We empirically show that the approach increases the number of successful evaluations made by an AutoML tool while preserving or even improving on the previously best solutions.
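
The idea of combining per-algorithm runtime models into one pipeline-level prediction can be illustrated with a minimal sketch. The abstract does not say how the overall prediction is derived, so the sketch simply assumes it is the sum of the pre-processor's and the learner's predicted runtimes. All names here (`DatasetShape`, `RUNTIME_MODELS`, `predict_pipeline_runtime`, `will_time_out`) and the toy cost formulas are hypothetical placeholders, not the authors' method or models.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class DatasetShape:
    """Hypothetical dataset description used as input to the runtime models."""
    n_rows: int
    n_features: int


# A runtime model maps a dataset description to a predicted runtime in seconds.
RuntimeModel = Callable[[DatasetShape], float]

# Stand-ins for models that would be trained offline, one per algorithm.
# The formulas are invented for illustration only.
RUNTIME_MODELS: Dict[str, RuntimeModel] = {
    "pca": lambda d: 1e-6 * d.n_rows * d.n_features,
    "random_forest": lambda d: 5e-6 * d.n_rows * d.n_features,
}


def predict_pipeline_runtime(
    learner: str,
    shape: DatasetShape,
    preprocessor: Optional[str] = None,
    models: Dict[str, RuntimeModel] = RUNTIME_MODELS,
) -> float:
    """Predict the runtime of a pipeline with at most one pre-processor.

    Assumption: the overall prediction is the sum of the per-algorithm
    predictions, and both steps see the same dataset shape.
    """
    total = 0.0
    if preprocessor is not None:
        total += models[preprocessor](shape)
    total += models[learner](shape)
    return total


def will_time_out(
    learner: str,
    shape: DatasetShape,
    timeout_s: float,
    preprocessor: Optional[str] = None,
) -> bool:
    """Return True if the predicted runtime exceeds the timeout."""
    return predict_pipeline_runtime(learner, shape, preprocessor) > timeout_s


if __name__ == "__main__":
    shape = DatasetShape(n_rows=100_000, n_features=50)
    print(predict_pipeline_runtime("random_forest", shape, preprocessor="pca"))
    print(will_time_out("random_forest", shape, timeout_s=20.0, preprocessor="pca"))
```

An AutoML tool could call a check like `will_time_out` before evaluating a candidate and skip pipelines predicted to exceed the budget. A more faithful sketch would also have to account for how a pre-processor changes the data passed to the learner, which this simplification ignores.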
