Ensemble Learning for Disease Prediction: A Review.

Author information

Mahajan Palak, Uddin Shahadat, Hajati Farshid, Moni Mohammad Ali

Affiliations

College of Engineering and Science, Victoria University, Sydney, NSW 2000, Australia.

School of Project Management, Faculty of Engineering, The University of Sydney, Forest Lodge, NSW 2037, Australia.

Publication information

Healthcare (Basel). 2023 Jun 20;11(12):1808. doi: 10.3390/healthcare11121808.

DOI:10.3390/healthcare11121808

PMID:37372925

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10298658/

Abstract

Machine learning models are used to create and enhance various disease prediction frameworks. Ensemble learning is a machine learning technique that combines multiple classifiers to make more accurate predictions than a single classifier. Although numerous studies have employed ensemble approaches for disease prediction, commonly used ensemble approaches have not been thoroughly assessed against heavily researched diseases. Consequently, this study aims to identify significant trends in the accuracy of four ensemble techniques (bagging, boosting, stacking, and voting) across five heavily researched diseases (diabetes, skin disease, kidney disease, liver disease, and heart conditions). Using a well-defined search strategy, we first identified 45 articles from the current literature that applied two or more of the four ensemble approaches to any of these five diseases and were published between 2016 and 2023. Although stacking was used the fewest times (23) compared with bagging (41) and boosting (37), it most often achieved the most accurate performance (19 out of 23 times). Voting was the second-best ensemble approach in this review. Stacking always showed the most accurate performance in the reviewed articles for skin disease and diabetes. Bagging demonstrated the best performance for kidney disease (five out of six times) and boosting for liver disease and diabetes (four out of six times). These results indicate that stacking has demonstrated greater accuracy in disease prediction than the other three candidate algorithms. Our study also demonstrates variability in the perceived performance of different ensemble approaches against frequently used disease datasets. The findings of this work will assist researchers in better understanding current trends and hotspots in disease prediction models that employ ensemble learning, as well as in determining a more suitable ensemble model for predictive disease analytics.
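To make the idea of combining multiple classifiers concrete, the following is a minimal sketch of two of the four approaches named in the abstract: bagging (training each base model on a bootstrap resample) and hard voting (taking the majority label). It is not taken from the reviewed articles. The one-feature threshold "stump", the function names, and the toy measurements are all assumptions chosen for illustration. Boosting and stacking are not shown, since they additionally need sequential reweighting and a meta-learner, respectively.

```python
import random
from collections import Counter

# Hypothetical toy data: one made-up measurement per patient, label 1 = disease.
XS = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
YS = [0, 0, 0, 0, 1, 1, 1, 1]


def fit_stump(xs, ys):
    """Fit a one-feature threshold classifier that predicts 1 when x >= threshold."""
    best_threshold, best_correct = None, -1
    for t in sorted(set(xs)):
        # Count training points that the rule "x >= t" labels correctly.
        correct = sum((x >= t) == bool(y) for x, y in zip(xs, ys))
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


def bagging_fit(xs, ys, n_models, rng):
    """Bagging: fit each stump on a bootstrap resample (sampling with replacement)."""
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def majority_vote(labels):
    """Hard voting: return the most common label among the base predictions."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_predict(thresholds, x):
    """Combine the stumps' individual predictions for x by majority vote."""
    return majority_vote([int(x >= t) for t in thresholds])


# Use an odd number of base models so a binary vote cannot tie.
thresholds = bagging_fit(XS, YS, n_models=5, rng=random.Random(0))
print("bagged thresholds:", thresholds)
print("prediction for x=0.0:", ensemble_predict(thresholds, 0.0))
print("prediction for x=10.0:", ensemble_predict(thresholds, 10.0))
```

In practice the base learners would be full classifiers such as decision trees, and the comparison would be run with cross-validation on a real disease dataset. The sketch only shows how resampling and voting fit together.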
