
A Probability-Based Models Ranking Approach: An Alternative Method of Machine-Learning Model Performance Assessment.

Affiliation

Faculty of Economic Sciences, University of Warsaw, Długa Street 44/50, 00-241 Warsaw, Poland.

Publication information

Sensors (Basel). 2022 Aug 24;22(17):6361. doi: 10.3390/s22176361.

DOI:10.3390/s22176361
PMID:36080820
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9460558/
Abstract

Performance measures are crucial in selecting the best machine learning model for a given problem. Estimating classical model performance measures by subsampling methods such as bagging or cross-validation has several weaknesses. The most important are the inability to test the significance of the difference and the lack of interpretability. The recently proposed Elo-based Predictive Power (EPP), a meta-measure of machine learning model performance, is an attempt to address these weaknesses. However, the EPP is based on wrong assumptions, so its estimates may not be correct. This paper introduces the Probability-based Ranking Model Approach (PMRA), a modified EPP approach with a correction that makes its estimates more reliable. PMRA calculates the probability that one model achieves a better result than another, using a Mixed Effects Logistic Regression model. The empirical analysis was carried out on a real mortgage credit dataset. It included a comparison of how PMRA and state-of-the-art k-fold cross-validation ranked 49 machine learning models, an example application of the novel method to a hyperparameter tuning problem, and a comparison of PMRA and EPP indications. PMRA makes it possible to compare a newly developed algorithm with state-of-the-art algorithms based on statistical criteria. It can also be used to select the best hyperparameter configuration and to formulate criteria for continuing the hyperparameter space search.
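The core quantity in the abstract, the probability that one model achieves a better result than another, can be illustrated with a rough sketch. The code below is not the paper's method: PMRA estimates these probabilities with a Mixed Effects Logistic Regression model, whereas this sketch only counts empirical wins across resampling rounds. The model names, score values, and all function names are hypothetical, and the scores are simulated rather than taken from the mortgage dataset.

```python
import random


def simulate_scores(true_means, n_rounds, noise=0.02, seed=0):
    """Simulate one score per model per resampling round (hypothetical data)."""
    rng = random.Random(seed)
    return {
        name: [rng.gauss(mu, noise) for _ in range(n_rounds)]
        for name, mu in true_means.items()
    }


def win_probability(scores_a, scores_b):
    """Share of paired rounds in which model A scores higher than model B.

    Ties count as half a win for each side.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both models need a score for every round")
    wins = 0.0
    for a, b in zip(scores_a, scores_b):
        if a > b:
            wins += 1.0
        elif a == b:
            wins += 0.5
    return wins / len(scores_a)


def pairwise_matrix(scores):
    """Empirical win probability for every ordered pair of distinct models."""
    names = sorted(scores)
    return {
        (a, b): win_probability(scores[a], scores[b])
        for a in names
        for b in names
        if a != b
    }


def rank_by_mean_win_probability(scores):
    """Order models by their average win probability against all others."""
    names = sorted(scores)
    probs = pairwise_matrix(scores)
    averages = {
        a: sum(probs[(a, b)] for b in names if b != a) / (len(names) - 1)
        for a in names
    }
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical models with assumed mean AUC values.
    scores = simulate_scores({"gbm": 0.80, "rf": 0.78, "logit": 0.75}, n_rounds=200)
    for name, avg in rank_by_mean_win_probability(scores):
        print(f"{name}: mean win probability {avg:.3f}")
```

Counting wins gives a quantity that is directly interpretable ("model A beats model B in this share of rounds"), but unlike the regression-based approach described in the abstract it provides no significance test on its own.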

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/1013583abdfb/sensors-22-06361-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/9ce9963a9bdc/sensors-22-06361-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/955fc97fdf27/sensors-22-06361-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/d1e42d694e4f/sensors-22-06361-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/d9080b28330b/sensors-22-06361-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/5971b4cf77d2/sensors-22-06361-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/9a107b684b51/sensors-22-06361-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/65232df4b08d/sensors-22-06361-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/b56bb465784f/sensors-22-06361-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/07a67cc5242b/sensors-22-06361-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/bd3039d0d288/sensors-22-06361-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/1e01410ed87e/sensors-22-06361-g012.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cf9b/9460558/d45d82a53787/sensors-22-06361-g013.jpg

Similar articles

1
A Probability-Based Models Ranking Approach: An Alternative Method of Machine-Learning Model Performance Assessment.
Sensors (Basel). 2022 Aug 24;22(17):6361. doi: 10.3390/s22176361.
2
Tuning hyperparameters of machine learning algorithms and deep neural networks using metaheuristics: A bioinformatics study on biomedical and biological cases.
Comput Biol Chem. 2022 Apr;97:107619. doi: 10.1016/j.compbiolchem.2021.107619. Epub 2021 Dec 24.
3
A comparison of machine learning and logistic regression in modelling the association of body condition score and submission rate.
Prev Vet Med. 2019 Nov 1;171:104765. doi: 10.1016/j.prevetmed.2019.104765. Epub 2019 Aug 31.
4
[Machine learning for predictive analyses in health: an example of an application to predict death in the elderly in São Paulo, Brazil].
Cad Saude Publica. 2019 Jul 29;35(7):e00050818. doi: 10.1590/0102-311X00050818.
5
AutoML-ID: automated machine learning model for intrusion detection using wireless sensor network.
Sci Rep. 2022 May 31;12(1):9074. doi: 10.1038/s41598-022-13061-z.
6
Integrated Evolutionary Learning: An Artificial Intelligence Approach to Joint Learning of Features and Hyperparameters for Optimized, Explainable Machine Learning.
Front Artif Intell. 2022 Apr 5;5:832530. doi: 10.3389/frai.2022.832530. eCollection 2022.
7
Application of Machine Learning to Child Mode Choice with a Novel Technique to Optimize Hyperparameters.
Int J Environ Res Public Health. 2022 Dec 15;19(24):16844. doi: 10.3390/ijerph192416844.
8
Evaluation of multiple prediction models: A novel view on model selection and performance assessment.
Stat Methods Med Res. 2020 Jun;29(6):1728-1745. doi: 10.1177/0962280219854487. Epub 2019 Sep 12.
9
Can Hyperparameter Tuning Improve the Performance of a Super Learner?: A Case Study.
Epidemiology. 2019 Jul;30(4):521-531. doi: 10.1097/EDE.0000000000001027.
10
Algorithm Recommendation and Performance Prediction Using Meta-Learning.
Int J Neural Syst. 2023 Mar;33(3):2350011. doi: 10.1142/S0129065723500119. Epub 2023 Feb 1.

Cited by

1
Deep-PK: deep learning for small molecule pharmacokinetic and toxicity prediction.
Nucleic Acids Res. 2024 Jul 5;52(W1):W469-W475. doi: 10.1093/nar/gkae254.
2
Applications of machine learning in metabolomics: Disease modeling and classification.
Front Genet. 2022 Nov 24;13:1017340. doi: 10.3389/fgene.2022.1017340. eCollection 2022.
