ML-CKDP: Machine learning-based chronic kidney disease prediction with smart web application.

Authors

Halder Rajib Kumar, Uddin Mohammed Nasir, Uddin Md Ashraf, Aryal Sunil, Saha Sajeeb, Hossen Rakib, Ahmed Sabbir, Rony Mohammad Abu Tareq, Akter Mosammat Farida

Affiliations

Dept. of Computer Science and Engineering, Jagannath University, Dhaka 1100, Bangladesh.

School of Information Technology, Deakin University, Geelong 3125, Australia.

Publication information

J Pathol Inform. 2024 Feb 22;15:100371. doi: 10.1016/j.jpi.2024.100371. eCollection 2024 Dec.

Abstract

Chronic kidney disease (CKD) is a significant public health issue that can lead to severe complications such as hypertension, anemia, and renal failure, so timely diagnosis is crucial for effective management. Machine learning offers promising advances in predictive diagnostics for healthcare. In this paper, we develop a machine learning-based chronic kidney disease prediction (ML-CKDP) model with two objectives: to improve dataset preprocessing for CKD classification and to provide a web-based application for CKD prediction. The preprocessing protocol converts categorical variables to numerical values, imputes missing data, and normalizes features with Min-Max scaling. Feature selection uses Correlation, Chi-Square, Variance Threshold, Recursive Feature Elimination, Sequential Forward Selection, Lasso Regression, and Ridge Regression to refine the datasets. Seven classifiers are used to predict CKD: Random Forest (RF), AdaBoost (AdaB), Gradient Boosting (GB), XGBoost (XgB), Naive Bayes (NB), Support Vector Machine (SVM), and Decision Tree (DT). The models are evaluated by accuracy, confusion matrix statistics, and the Area Under the Curve (AUC) for the positive class. RF and AdaB achieve 100% accuracy under every validation scheme tested: 70:30 and 80:20 train-test splits and K-fold cross-validation with K = 10 and K = 15. They also consistently reach an AUC of 100% across multiple datasets and splitting ratios. NB is the most efficient classifier, recording the lowest training and testing times across all datasets and split ratios. Finally, we present a real-time web-based application that operationalizes the model and makes it accessible to healthcare practitioners and stakeholders.
Web app link: https://rajib-research-kedney-diseases-prediction.onrender.com/.
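The three preprocessing steps named in the abstract (categorical-to-numeric conversion, imputation, Min-Max scaling) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the records, the column names, the category mappings, and the choice of mean and most-frequent imputation are all assumptions made for illustration, since the abstract does not state which imputation strategy was used.

```python
from statistics import mean, mode

# Hypothetical toy records; column names and values are illustrative only,
# not taken from the paper's datasets. None marks a missing value.
records = [
    {"age": 48, "blood_pressure": 80, "hypertension": "yes", "label": "ckd"},
    {"age": None, "blood_pressure": 70, "hypertension": "no", "label": "notckd"},
    {"age": 62, "blood_pressure": None, "hypertension": None, "label": "ckd"},
    {"age": 35, "blood_pressure": 60, "hypertension": "no", "label": "notckd"},
]

NUMERIC = ["age", "blood_pressure"]
# Assumed category-to-integer mappings for the categorical columns.
CATEGORICAL = {
    "hypertension": {"no": 0, "yes": 1},
    "label": {"notckd": 0, "ckd": 1},
}


def preprocess(rows):
    """Encode categoricals, impute missing values, then Min-Max scale."""
    rows = [dict(r) for r in rows]  # copy so the input is left untouched

    # 1. Convert categorical strings to integers (missing values stay None).
    for col, mapping in CATEGORICAL.items():
        for r in rows:
            if r[col] is not None:
                r[col] = mapping[r[col]]

    # 2. Impute (assumed strategy): column mean for numeric features,
    #    most frequent value for categorical features.
    for col in NUMERIC:
        fill = mean(r[col] for r in rows if r[col] is not None)
        for r in rows:
            if r[col] is None:
                r[col] = fill
    for col in CATEGORICAL:
        fill = mode(r[col] for r in rows if r[col] is not None)
        for r in rows:
            if r[col] is None:
                r[col] = fill

    # 3. Min-Max scaling: x' = (x - min) / (max - min), mapping to [0, 1].
    for col in NUMERIC:
        lo = min(r[col] for r in rows)
        hi = max(r[col] for r in rows)
        span = hi - lo
        for r in rows:
            r[col] = (r[col] - lo) / span if span else 0.0
    return rows


processed = preprocess(records)
```

In a real evaluation the scaling bounds and imputation values would normally be computed on the training portion of each split only and then applied to the test portion, so that no information leaks from test to train. The sketch skips that step for brevity.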

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e5c/10950726/3cff48cf694b/ga1.jpg
