

Gradient-based optimization of hyperparameters.

Author

Bengio Y

Affiliation

Département d'informatique et recherche opérationnelle, Université de Montréal, Montréal, Québec, Canada, H3C 3J7.

Publication

Neural Comput. 2000 Aug;12(8):1889-900. doi: 10.1162/089976600300015187.

DOI: 10.1162/089976600300015187
PMID: 10953243
Abstract

Many machine learning algorithms can be formulated as the minimization of a training criterion that involves a hyperparameter. This hyperparameter is usually chosen by trial and error with a model selection criterion. In this article we present a methodology to optimize several hyperparameters, based on the computation of the gradient of a model selection criterion with respect to the hyperparameters. In the case of a quadratic training criterion, the gradient of the selection criterion with respect to the hyperparameters is efficiently computed by backpropagating through a Cholesky decomposition. In the more general case, we show that the implicit function theorem can be used to derive a formula for the hyperparameter gradient involving second derivatives of the training criterion.
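As a concrete sketch of the implicit-function-theorem idea from the abstract: for ridge regression the training criterion is quadratic, the minimizer solves (XᵀX + λI)w = Xᵀy, and differentiating this optimality condition gives dw/dλ = -(XᵀX + λI)⁻¹w, from which the gradient of a validation loss with respect to λ follows by the chain rule. The snippet below is an illustrative reconstruction under these standard formulas, not code from the paper; all data and names are made up, and the gradient is checked against a finite difference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic train/validation split (illustrative data, not from the paper).
X_tr = rng.normal(size=(50, 5))
w_true = rng.normal(size=5)
y_tr = X_tr @ w_true + 0.1 * rng.normal(size=50)
X_va = rng.normal(size=(30, 5))
y_va = X_va @ w_true + 0.1 * rng.normal(size=30)

def val_loss_and_grad(lam):
    """Validation loss and its gradient w.r.t. the ridge hyperparameter lam."""
    # Hessian of the quadratic training criterion.
    H = X_tr.T @ X_tr + lam * np.eye(X_tr.shape[1])
    # Minimizer of the training criterion: H w = X'y.
    w = np.linalg.solve(H, X_tr.T @ y_tr)
    r = X_va @ w - y_va
    loss = r @ r
    # Implicit function theorem on the optimality condition H w = X'y:
    # dw/dlam = -H^{-1} w.
    dw_dlam = -np.linalg.solve(H, w)
    # Chain rule through the validation loss.
    dloss = 2.0 * (X_va.T @ r) @ dw_dlam
    return loss, dloss

# Check the analytic gradient against a central finite difference.
lam = 0.5
loss, g = val_loss_and_grad(lam)
eps = 1e-6
fd = (val_loss_and_grad(lam + eps)[0] - val_loss_and_grad(lam - eps)[0]) / (2 * eps)
print(abs(g - fd))  # discrepancy should be negligible
```

The hyperparameter can then be tuned by gradient descent on the validation loss; the paper's Cholesky-backpropagation result makes the same computation efficient when many hyperparameters are involved.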


Similar articles

1
Gradient-based optimization of hyperparameters.
Neural Comput. 2000 Aug;12(8):1889-900. doi: 10.1162/089976600300015187.
2
Efficient gradient computation for optimization of hyperparameters.
Phys Med Biol. 2022 Feb 16;67(3). doi: 10.1088/1361-6560/ac4442.
3
Gradient based hyperparameter optimization in Echo State Networks.
Neural Netw. 2019 Jul;115:23-29. doi: 10.1016/j.neunet.2019.02.001. Epub 2019 Mar 8.
4
Integrated Evolutionary Learning: An Artificial Intelligence Approach to Joint Learning of Features and Hyperparameters for Optimized, Explainable Machine Learning.
Front Artif Intell. 2022 Apr 5;5:832530. doi: 10.3389/frai.2022.832530. eCollection 2022.
5
Hyperparameter Selection
6
SVM Modeling via a Hybrid Genetic Strategy. A Health Care Application.
Stud Health Technol Inform. 2005;116:193-8.
7
The Q-norm complexity measure and the minimum gradient method: a novel approach to the machine learning structural risk minimization problem.
IEEE Trans Neural Netw. 2008 Aug;19(8):1415-30. doi: 10.1109/TNN.2008.2000442.
8
Hyperparameter Optimization Techniques for Designing Software Sensors Based on Artificial Neural Networks.
Sensors (Basel). 2021 Dec 17;21(24):8435. doi: 10.3390/s21248435.
9
Heuristic hyperparameter optimization of deep learning models for genomic prediction.
G3 (Bethesda). 2021 Jul 14;11(7). doi: 10.1093/g3journal/jkab032.
10
Hyperparameter selection for dataset-constrained semantic segmentation: Practical machine learning optimization.
J Appl Clin Med Phys. 2024 Dec;25(12):e14542. doi: 10.1002/acm2.14542. Epub 2024 Oct 10.

Cited by

1
Advancing computational evaluation of adsorption via porous materials by artificial intelligence and computational fluid dynamics.
Sci Rep. 2025 Aug 13;15(1):29691. doi: 10.1038/s41598-025-15538-z.
2
A Hands-On Introduction to Data Analytics for Biomedical Research.
Function (Oxf). 2025 Mar 24;6(2). doi: 10.1093/function/zqaf015.
3
Offline Reward Perturbation Boosts Distributional Shift in Online RL.
Uncertain Artif Intell. 2024 Jul;2024.
4
Deep learning-based classification of the capillary ultrastructure in human skeletal muscles.
Front Mol Biosci. 2024 May 1;11:1363384. doi: 10.3389/fmolb.2024.1363384. eCollection 2024.
5
The cortical representation of language timescales is shared between reading and listening.
Commun Biol. 2024 Mar 7;7(1):284. doi: 10.1038/s42003-024-05909-z.
6
Physics-Informed Neural Networks for the Condition Monitoring of Rotating Shafts.
Sensors (Basel). 2023 Dec 29;24(1):207. doi: 10.3390/s24010207.
7
The Cortical Representation of Language Timescales is Shared between Reading and Listening.
bioRxiv. 2023 Dec 11:2023.01.06.522601. doi: 10.1101/2023.01.06.522601.
8
Comparison of pre-trained language models in terms of carbon emissions, time and accuracy in multi-label text classification using AutoML.
Heliyon. 2023 May 1;9(5):e15670. doi: 10.1016/j.heliyon.2023.e15670. eCollection 2023 May.
9
A Survey on Optimization Techniques for Edge Artificial Intelligence (AI).
Sensors (Basel). 2023 Jan 22;23(3):1279. doi: 10.3390/s23031279.
10
Feature-space selection with banded ridge regression.
Neuroimage. 2022 Dec 1;264:119728. doi: 10.1016/j.neuroimage.2022.119728. Epub 2022 Nov 8.