Evolving Fully Automated Machine Learning via Life-Long Knowledge Anchors.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2021 Sep;43(9):3091-3107. doi: 10.1109/TPAMI.2021.3069250. Epub 2021 Aug 4.

DOI:10.1109/TPAMI.2021.3069250
PMID:33780333
Abstract

Automated machine learning (AutoML) has achieved remarkable progress on various tasks, which is attributed to how little manual feature and model design it requires. However, most existing AutoML pipelines cover only parts of the full machine learning pipeline, e.g., neural architecture search or optimizer selection. This leaves potentially important components such as data cleaning and model ensembling out of the optimization, and still results in considerable human involvement and suboptimal performance. The main challenges lie in the huge search space that combines all possibilities over all components, and in generalizing across different task types such as image, text, and tabular data. In this paper, we present a first-of-its-kind fully automated machine learning pipeline that comprehensively automates data preprocessing, feature engineering, model generation/selection/training, and ensembling for an arbitrary dataset and evaluation metric. Our innovation lies in the comprehensive scope of the learning pipeline, with a novel "life-long" knowledge anchor design to fundamentally accelerate the search over the full search space. These knowledge anchors record detailed information about pipelines and integrate it with an evolutionary algorithm for joint optimization across components. Experiments demonstrate that the resulting pipeline achieves state-of-the-art performance on multiple datasets and modalities. Specifically, the proposed framework was extensively evaluated in the NeurIPS 2019 AutoDL challenge, where it was the sole champion, with a significant gap over other approaches, across all of the image, video, speech, text, and tabular tracks.
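
The abstract describes the knowledge anchors only at a high level, so the following is a minimal Python sketch of the general idea rather than the authors' method: an evolutionary search over whole-pipeline configurations that records every evaluated pipeline in a persistent store and seeds later searches from it. All names here (`SEARCH_SPACE`, `toy_score`, `mutate`, `evolve`) and the four-component search space are hypothetical, and `toy_score` is a stand-in for actually training and validating a pipeline.

```python
import random

# Hypothetical search space: one choice per pipeline component.
SEARCH_SPACE = {
    "preprocess": ["none", "standardize", "impute"],
    "features": ["raw", "pca", "select_k"],
    "model": ["linear", "tree", "mlp"],
    "ensemble": ["none", "bagging", "stacking"],
}


def toy_score(pipeline):
    """Stand-in for training and validating a pipeline (not a real evaluation)."""
    target = {"preprocess": "standardize", "features": "select_k",
              "model": "tree", "ensemble": "stacking"}
    return sum(pipeline[k] == v for k, v in target.items()) / len(target)


def mutate(pipeline, rng):
    """Copy a pipeline and resample one randomly chosen component."""
    child = dict(pipeline)
    component = rng.choice(sorted(SEARCH_SPACE))
    child[component] = rng.choice(SEARCH_SPACE[component])
    return child


def as_key(pipeline):
    """Hashable, order-independent representation of a pipeline."""
    return tuple(sorted(pipeline.items()))


def evolve(anchors, generations=30, population_size=8, seed=0):
    """Evolutionary search that reads from and writes to a shared anchor store.

    `anchors` maps a pipeline key to its score. It is mutated in place and is
    meant to persist across calls, so later searches start from earlier results.
    """
    rng = random.Random(seed)
    # Seed the population from the best recorded anchors, then fill randomly.
    best_known = sorted(anchors, key=anchors.get, reverse=True)[:population_size]
    population = [dict(key) for key in best_known]
    while len(population) < population_size:
        population.append({k: rng.choice(v) for k, v in SEARCH_SPACE.items()})

    for _ in range(generations):
        children = [mutate(rng.choice(population), rng)
                    for _ in range(population_size)]
        candidates = population + children
        for candidate in candidates:
            key = as_key(candidate)
            if key not in anchors:  # recorded pipelines are never re-evaluated
                anchors[key] = toy_score(candidate)
        # Keep the highest-scoring distinct pipelines (sorted first so that
        # ties are broken deterministically).
        distinct = sorted({as_key(p) for p in candidates})
        ranked = sorted(distinct, key=anchors.get, reverse=True)
        population = [dict(key) for key in ranked[:population_size]]

    best_key = max(anchors, key=anchors.get)
    return dict(best_key), anchors[best_key]
```

Calling `evolve(anchors)` repeatedly with the same `anchors` dictionary, for example once per new task, lets each run begin from the best pipelines found so far and skip any configuration that has already been scored. That reuse is the only sense in which this sketch is "life-long"; the paper's anchors are described as recording more detailed pipeline information than a single score.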

Similar articles

1. Evolving Fully Automated Machine Learning via Life-Long Knowledge Anchors.
   IEEE Trans Pattern Anal Mach Intell. 2021 Sep;43(9):3091-3107. doi: 10.1109/TPAMI.2021.3069250. Epub 2021 Aug 4.
2. Auto-Pytorch: Multi-Fidelity MetaLearning for Efficient and Robust AutoDL.
   IEEE Trans Pattern Anal Mach Intell. 2021 Sep;43(9):3079-3090. doi: 10.1109/TPAMI.2021.3067763. Epub 2021 Aug 4.
3. Human behavior in image-based Road Health Inspection Systems despite the emerging AutoML.
   J Big Data. 2022;9(1):96. doi: 10.1186/s40537-022-00646-8. Epub 2022 Jul 20.
4. AutoML for Multi-Label Classification: Overview and Empirical Evaluation.
   IEEE Trans Pattern Anal Mach Intell. 2021 Sep;43(9):3037-3054. doi: 10.1109/TPAMI.2021.3051276. Epub 2021 Aug 4.
5. Predicting Machine Learning Pipeline Runtimes in the Context of Automated Machine Learning.
   IEEE Trans Pattern Anal Mach Intell. 2021 Sep;43(9):3055-3066. doi: 10.1109/TPAMI.2021.3056950. Epub 2021 Aug 4.
6. Automated machine learning: Review of the state-of-the-art and opportunities for healthcare.
   Artif Intell Med. 2020 Apr;104:101822. doi: 10.1016/j.artmed.2020.101822. Epub 2020 Feb 21.
7. Advances in neural architecture search.
   Natl Sci Rev. 2024 Aug 23;11(8):nwae282. doi: 10.1093/nsr/nwae282. eCollection 2024 Aug.
8. Adaptation Strategies for Automated Machine Learning on Evolving Data.
   IEEE Trans Pattern Anal Mach Intell. 2021 Sep;43(9):3067-3078. doi: 10.1109/TPAMI.2021.3062900. Epub 2021 Aug 4.
9. Benchmarking AutoML for regression tasks on small tabular data in materials design.
   Sci Rep. 2022 Nov 11;12(1):19350. doi: 10.1038/s41598-022-23327-1.
10. PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines.
    IEEE Trans Vis Comput Graph. 2021 Feb;27(2):390-400. doi: 10.1109/TVCG.2020.3030361. Epub 2021 Jan 28.

Cited by

1. Adaptive habitat biogeography-based optimizer for optimizing deep CNN hyperparameters in image classification.
   Heliyon. 2024 Mar 21;10(7):e28147. doi: 10.1016/j.heliyon.2024.e28147. eCollection 2024 Apr 15.
2. Human behavior in image-based Road Health Inspection Systems despite the emerging AutoML.
   J Big Data. 2022;9(1):96. doi: 10.1186/s40537-022-00646-8. Epub 2022 Jul 20.