A Novel Approach for Feature Selection and Classification of Diabetes Mellitus: Machine Learning Methods.

Affiliations

CSE Department, Gautam Buddha University, Greater Noida, India.

Cedargate Technologies, Kathmandu, Nepal.

Publication information

Comput Intell Neurosci. 2022 Apr 15;2022:3820360. doi: 10.1155/2022/3820360. eCollection 2022.

Abstract

Diabetes prediction is an active research area in which medical experts are trying to anticipate the disease more accurately. Surveys conducted by the WHO have shown a remarkable increase in the number of diabetic patients. Diabetes generally remains dormant and aggravates other conditions a patient is diagnosed with, such as damage to the kidney vessels, problems in the retina of the eye, and cardiac problems; if unidentified, it can cause metabolic disorders and many complications in the body. The main objective of our study is to compare different classifiers and feature selection methods in order to predict diabetes with greater accuracy. In this paper, we studied multilayer perceptron, decision tree, K-nearest neighbour, and random forest classifiers, and applied a few feature selection techniques to the classifiers to detect diabetes at an early stage. Raw data is preprocessed by removing outliers and imputing missing values with the mean, followed by hyperparameter optimization. Experiments were conducted on the PIMA Indians diabetes dataset using Weka 3.9, and the accuracy achieved for multilayer perceptron is 77.60%, for decision trees is 76.07%, for K-nearest neighbour is 78.58%, and for random forest is , which is by far the best accuracy for the random forest classifier.
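The preprocessing and classification steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' Weka workflow: it is plain Python, the 1.5 × IQR outlier rule is an assumption (the abstract does not say which rule was used), the function names are hypothetical, and the seven toy rows of (glucose, BMI) values are invented for illustration rather than taken from the PIMA dataset. Feature selection and hyperparameter optimization are left out.

```python
import math
from collections import Counter
from statistics import mean, quantiles


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column."""
    cols = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in cols]
    return [[col_means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def iqr_keep_mask(rows, k=1.5):
    """Flag rows whose every feature lies within [Q1 - k*IQR, Q3 + k*IQR].

    The IQR rule is an assumed outlier criterion, not one stated in the abstract.
    """
    bounds = []
    for col in zip(*rows):
        q1, _, q3 = quantiles(col, n=4)
        iqr = q3 - q1
        bounds.append((q1 - k * iqr, q3 + k * iqr))
    return [all(lo <= v <= hi for v, (lo, hi) in zip(row, bounds))
            for row in rows]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Invented toy rows: [glucose, BMI]; None marks a missing value.
X_raw = [
    [85.0, 22.0],
    [90.0, None],
    [88.0, 24.0],
    [150.0, 33.0],
    [160.0, 35.0],
    [155.0, 34.0],
    [600.0, 30.0],  # implausible glucose reading, expected to be dropped
]
y_raw = [0, 0, 0, 1, 1, 1, 1]  # 1 = diabetic, 0 = non-diabetic

# Step 1: mean imputation. Step 2: outlier removal. Step 3: classification.
X_imputed = impute_mean(X_raw)
mask = iqr_keep_mask(X_imputed)
X_clean = [row for row, keep in zip(X_imputed, mask) if keep]
y_clean = [label for label, keep in zip(y_raw, mask) if keep]

print(len(X_clean), "rows kept after outlier removal")
print("predicted class:", knn_predict(X_clean, y_clean, [152.0, 33.5]))
```

The sketch leaves features unscaled for brevity. Because K-nearest neighbour depends on distances, a real comparison would normally standardize each feature first and estimate accuracy with cross-validation instead of a single query point.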

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/55b5/9033325/af9a615bab9c/CIN2022-3820360.001.jpg
