Integrated bagging-RF learning model for diabetes diagnosis in middle-aged and elderly population.

Author information

Shi Yuanwu, Sun Jiuye

Affiliation

College of Art and Design, Wuhan Textile University, Wuhan, Hubei, China.

Publication information

PeerJ Comput Sci. 2024 Oct 31;10:e2436. doi: 10.7717/peerj-cs.2436. eCollection 2024.

DOI:10.7717/peerj-cs.2436

PMID:39650520

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11623014/

Abstract

As the population ages, the growing number of middle-aged and older adults with diabetes poses new challenges for resource allocation in the healthcare system. Developing accurate diabetes prediction models is a critical public health strategy for using healthcare resources efficiently and ensuring timely, effective treatment. To improve the identification of diabetes in middle-aged and older patients, a Bagging-RF model is proposed. Two diabetes datasets from Kaggle were first preprocessed with one-hot encoding, outlier removal, and age screening; the data were then divided into three age groups (50-60, 60-70, and 70-80) and balanced using the SMOTE technique. Next, the Bagging-RF ensemble model and eight other machine learning classifiers were trained. Finally, model performance was evaluated by accuracy, F1 score, and other metrics. The Bagging-RF model outperformed the other eight classifiers. On the Diabetes Prediction Dataset, it achieved 97.35%, 95.55%, and 95.14% accuracy and 97.35%, 97.35%, and 95.14% F1 score for the 50-60, 60-70, and 70-80 age groups, respectively; the second set of reported results is 97.03%, 94.90%, and 93.70% accuracy and 97.03%, 94.90%, and 93.70% F1 score. In addition, although other ensemble learning models such as ET, RF, Adaboost, and XGB fail to outperform Bagging-RF, they also perform very well.
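The balancing, training, and evaluation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn and NumPy, uses synthetic data from `make_classification` in place of the Kaggle datasets, and picks hyperparameters arbitrarily because the abstract gives none. The helper `smote_like_oversample` is a hypothetical, hand-rolled stand-in for SMOTE (interpolating between minority-class neighbours), and applying it only to the training split is an assumption of this sketch.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors


def smote_like_oversample(X, y, k=5, seed=0):
    """Hypothetical SMOTE-style balancing for a binary problem:
    create synthetic minority points on segments between neighbours."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = int(counts.max() - counts.min())
    if n_new == 0:
        return X, y
    X_min = X[y == minority]
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X_min).kneighbors(X_min)
    base = rng.integers(0, len(X_min), size=n_new)        # seed points
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]  # one random neighbour each
    gap = rng.random((n_new, 1))                           # interpolation factor in [0, 1)
    synthetic = X_min[base] + gap * (X_min[neigh] - X_min[base])
    return np.vstack([X, synthetic]), np.concatenate([y, np.full(n_new, minority)])


def run_sketch(seed=0):
    # Synthetic, imbalanced stand-in for one age group of a diabetes dataset.
    X, y = make_classification(n_samples=600, n_features=8,
                               weights=[0.9, 0.1], random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed)
    X_bal, y_bal = smote_like_oversample(X_tr, y_tr, seed=seed)

    # Bagging with a random forest as the base learner ("Bagging-RF").
    # The base estimator is passed positionally so the call works whether the
    # installed scikit-learn names the parameter `estimator` or `base_estimator`.
    model = BaggingClassifier(
        RandomForestClassifier(n_estimators=20, random_state=seed),
        n_estimators=5, random_state=seed)
    model.fit(X_bal, y_bal)
    pred = model.predict(X_te)
    return y_bal, accuracy_score(y_te, pred), f1_score(y_te, pred)


if __name__ == "__main__":
    _, acc, f1 = run_sketch()
    print(f"accuracy={acc:.4f}  F1={f1:.4f}")
```

In a real run, the same function would be called once per age group (50-60, 60-70, 70-80) on the preprocessed data, and the scores on synthetic data here say nothing about the figures reported above.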
