College of Computer Science and Engineering, University of Hafr Al-Batin, Hafr Al-Batin, 39524, Saudi Arabia.
Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, 11671, Saudi Arabia.
BMC Med Res Methodol. 2024 Sep 27;24(1):221. doi: 10.1186/s12874-024-02324-0.
Diabetes is thought to be the most common illness in underdeveloped nations. Early detection and competent medical care are crucial to reducing its effects, and examining the signs associated with diabetes is one of the most effective ways to identify the condition. However, the datasets available for automated diabetes detection frequently contain missing values, which can degrade machine learning model performance, and this problem has not been well investigated in existing work. Existing studies on diabetes detection also lack accuracy and robustness. To address these gaps, this work proposes an automated diabetes prediction method that achieves high accuracy while handling missing values effectively. The proposed approach combines a stacked ensemble voting classifier built from three machine learning models with a KNN imputer for missing values. With the KNN imputer, the proposed model achieves accuracy, precision, recall, F1 score, and MCC of 98.59%, 99.26%, 99.75%, 99.45%, and 99.24%, respectively. The study compares the proposed model with seven other machine learning techniques in two scenarios: one with missing values removed and one with KNN imputation. The results indicate that the proposed model outperforms current state-of-the-art methods and confirm its efficacy. This work examines the problem of missing values in diabetes detection and demonstrates the capability of the KNN imputer. Medical professionals can use the results to detect problems early and improve care for diabetes patients.
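The KNN imputation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nan_euclidean` and `knn_impute`, the choice of `k`, and the toy feature rows (glucose, BMI, age) are all assumptions made for illustration. Each missing entry is filled with the mean of that feature over the `k` nearest rows in which the feature is observed, with distances computed only over the coordinates both rows share.

```python
import math


def nan_euclidean(a, b):
    """Euclidean distance over coordinates observed in both rows.

    The squared sum is rescaled by (total coords / shared coords) so rows
    with fewer shared coordinates are not favoured. Returns inf when the
    rows share no observed coordinate.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in pairs)
    return math.sqrt(squared * len(a) / len(pairs))


def knn_impute(rows, k=2):
    """Return a copy of `rows` with each None replaced by a neighbour mean.

    For a missing entry in column j, the donors are the k nearest other
    rows that have column j observed. Entries with no usable donor are
    left as None.
    """
    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                dist = nan_euclidean(row, other)
                if dist != math.inf:
                    donors.append((dist, other[j]))
            donors.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical toy records: [glucose, bmi, age]; None marks a missing value.
rows = [
    [148.0, 33.6, 50.0],
    [85.0, 26.6, 31.0],
    [183.0, None, 32.0],
    [89.0, 28.1, 21.0],
    [137.0, 43.1, 33.0],
]
imputed = knn_impute(rows, k=2)
print(imputed[2])
```

In practice, a library implementation such as scikit-learn's `sklearn.impute.KNNImputer` would normally be used instead of hand-written code, and features would usually be scaled first so that no single column dominates the distance. The abstract does not say which implementation or settings the authors used.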