Predicting Chronic Kidney Disease Using Hybrid Machine Learning Based on Apache Spark.

Affiliations

Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Cairo, Egypt.

Faculty of Informatics and Computer Science, British University, Egypt, Cairo, Egypt.

Publication information

Comput Intell Neurosci. 2022 Feb 23;2022:9898831. doi: 10.1155/2022/9898831. eCollection 2022.

DOI:10.1155/2022/9898831

PMID:35251161

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC8890824/

Abstract

Chronic kidney disease (CKD) has become widespread. It is associated with a heightened risk of serious conditions such as cardiovascular disease and end-stage renal disease, which can be avoided through early detection and treatment of people at risk. Machine learning algorithms can substantially help medical scientists diagnose the disease accurately at an early stage. Recently, big data platforms have been integrated with machine learning algorithms to add value to healthcare. This paper therefore proposes hybrid machine learning techniques that combine feature selection methods with machine learning classification algorithms on a big data platform (Apache Spark) to detect CKD. Two feature selection techniques, Relief-F and the chi-squared feature selection method, were applied to select the important features. Six machine learning classification algorithms were used: decision tree (DT), logistic regression (LR), Naive Bayes (NB), Random Forest (RF), support vector machine (SVM), and Gradient-Boosted Trees (GBT Classifier), the set including ensemble learning algorithms. Four evaluation metrics, namely accuracy, precision, recall, and F1-measure, were applied to validate the results. For each algorithm, cross-validation and testing results were computed on the full feature set, the features selected by Relief-F, and the features selected by the chi-squared method. The results showed that the SVM, DT, and GBT classifiers with the selected features achieved the best performance, reaching 100% accuracy. Overall, the features selected by Relief-F performed better than the full features and the features selected by chi-squared.
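To make two of the steps named in the abstract concrete, the following is a minimal plain-Python sketch of chi-squared feature ranking and of the four evaluation metrics. It is not the authors' Spark implementation: the function names (`chi_squared_score`, `select_top_k`, `binary_metrics`), the feature names, and the toy data are all hypothetical, and the sketch assumes categorical features and a binary label.

```python
from collections import Counter


def chi_squared_score(feature, labels):
    """Pearson chi-squared statistic between one categorical feature and the label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # counts per (feature value, label) cell
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Expected count if feature and label were independent.
            expected = f_count * l_count / n
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-squared score."""
    scores = {name: chi_squared_score(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (hypothetical): 1 = CKD, 0 = not CKD.
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "feature_a": [1, 1, 1, 0, 0, 0],  # perfectly aligned with the label
    "feature_b": [1, 0, 1, 0, 1, 0],  # only weakly related to the label
}
selected = select_top_k(columns, labels, k=1)
metrics = binary_metrics(labels, [1, 1, 0, 0, 0, 1])
print(selected, metrics)
```

In a hybrid setup of the kind the abstract describes, the selected feature subset would then be passed to each classifier, and the metrics would be computed on both the cross-validation folds and the held-out test set.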

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c79/8890824/4241de59cea1/CIN2022-9898831.001.jpg
