Progressive sampling-based Bayesian optimization for efficient and automatic machine learning model selection.

Authors

Zeng Xueqiang, Luo Gang

Affiliations

Computer Center, Nanchang University, 999 Xuefu Road, Nanchang, 330031 Jiangxi People's Republic of China.

Department of Biomedical Informatics and Medical Education, University of Washington, UW Medicine South Lake Union, 850 Republican Street, Building C, Box 358047, Seattle, WA 98109 USA.

Publication information

Health Inf Sci Syst. 2017 Sep 27;5(1):2. doi: 10.1007/s13755-017-0023-z. eCollection 2017 Dec.

DOI: 10.1007/s13755-017-0023-z
PMID: 29038732
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5617811/

Abstract

PURPOSE

Machine learning is broadly used for clinical data analysis. Before training a model, a machine learning algorithm must be selected. Also, the values of one or more model parameters termed hyper-parameters must be set. Selecting algorithms and hyper-parameter values requires advanced machine learning knowledge and many labor-intensive manual iterations. To lower the bar to machine learning, miscellaneous automatic selection methods for algorithms and/or hyper-parameter values have been proposed. Existing automatic selection methods are inefficient on large data sets. This poses a challenge for using machine learning in the clinical big data era.
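
To make the selection problem concrete, the following minimal Python sketch represents a combined search space over algorithms and their hyper-parameters and draws one candidate from it. The algorithm names, hyper-parameter names, and ranges are illustrative assumptions and are not taken from the paper.

```python
import random

# Hypothetical search space: each algorithm has its own hyper-parameters.
# All names and integer ranges here are illustrative assumptions.
SEARCH_SPACE = {
    "random_forest": {"n_trees": (10, 500), "max_depth": (2, 20)},
    "svm": {"log10_C": (-3, 3), "log10_gamma": (-4, 1)},
    "k_nearest_neighbors": {"k": (1, 50)},
}

def sample_configuration(rng):
    """Pick an algorithm, then an integer value for each of its hyper-parameters."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    hyper_parameters = {
        name: rng.randint(low, high)  # randint is inclusive on both ends
        for name, (low, high) in SEARCH_SPACE[algorithm].items()
    }
    return algorithm, hyper_parameters

def count_integer_configurations():
    """Count the integer grid points across all algorithms."""
    total = 0
    for ranges in SEARCH_SPACE.values():
        size = 1
        for low, high in ranges.values():
            size *= high - low + 1
        total += size
    return total

if __name__ == "__main__":
    print(sample_configuration(random.Random(0)))
    print(count_integer_configurations())
```

Even this toy space holds thousands of candidate configurations, which suggests why trying each one by hand, or training each one on a full large data set, quickly becomes impractical.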

METHODS

To address the challenge, this paper presents progressive sampling-based Bayesian optimization, an efficient and automatic selection method for both algorithms and hyper-parameter values.
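
The abstract does not give the algorithm's details, so the following is only a rough Python sketch of the general idea behind progressive sampling: candidates are first compared on a small training sample, and only the more promising ones are re-evaluated on larger samples. It is a simplification under stated assumptions. It uses a fixed list of k-nearest-neighbor settings and a simple halving rule instead of a Bayesian optimization surrogate model, and the synthetic data, function names, and sample sizes are all hypothetical.

```python
import random

def make_data(n, rng):
    """Synthetic two-class data: label is 1 when x + y > 1, with 10% label noise."""
    data = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        label = int(x + y > 1.0)
        if rng.random() < 0.1:
            label = 1 - label
        data.append(((x, y), label))
    return data

def knn_error_rate(train, validation, k):
    """Error rate of a k-nearest-neighbor classifier on the validation set."""
    errors = 0
    for (vx, vy), v_label in validation:
        nearest = sorted(
            train, key=lambda p: (p[0][0] - vx) ** 2 + (p[0][1] - vy) ** 2
        )[:k]
        votes = sum(label for _, label in nearest)
        predicted = int(2 * votes > len(nearest))  # majority vote
        errors += predicted != v_label
    return errors / len(validation)

def progressive_selection(candidates, train, validation, first_size=50):
    """Score candidates on growing training samples, halving the pool each round."""
    survivors = list(candidates)
    size = first_size
    while True:
        size = min(size, len(train))
        sample = train[:size]  # train is in random order, so a prefix is a random sample
        scored = sorted((knn_error_rate(sample, validation, k), k) for k in survivors)
        if len(survivors) == 1 or size == len(train):
            best_error, best_k = scored[0]
            return best_k, best_error
        survivors = [k for _, k in scored[: max(1, len(scored) // 2)]]
        size *= 2

if __name__ == "__main__":
    rng = random.Random(0)
    train, validation = make_data(400, rng), make_data(100, rng)
    print(progressive_selection([1, 3, 5, 7, 9, 11, 15, 21], train, validation))
```

In this sketch, weak candidates are discarded after being trained on only 50 or 100 examples, so the full training set is used for just one or two survivors. That is the source of the time savings that progressive sampling aims for on large data sets.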

RESULTS

We report an implementation of the method. We show that, compared to a state-of-the-art automatic selection method, our method can significantly reduce search time, classification error rate, and the standard deviation of error rate due to randomization.

CONCLUSIONS

This is major progress towards enabling fast turnaround in identifying high-quality solutions required by many machine learning-based clinical data analysis tasks.
