School of Systems and Technology, Department of Software Engineering, University of Management and Technology, Lahore, 54770, Pakistan.
Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Al Ahliyya Amman University, Amman, 19328, Jordan.
Sci Rep. 2024 Oct 7;14(1):23274. doi: 10.1038/s41598-024-74357-w.
Diabetes is a chronic health condition caused by insufficient or improper use of insulin in the body. If left undetected, it can lead to further complications, including damage to organs such as the heart, lungs, and eyes. Timely detection of diabetes helps patients obtain the right medication, diet, and exercise plan to lead a healthy life. Machine learning (ML) approaches have been used for rapid and reliable diabetes detection; however, existing approaches suffer from limited datasets, a lack of generalizability, and lower accuracy. This study proposes a novel feature extraction approach that addresses these limitations by using an ensemble of convolutional neural network (CNN) and long short-term memory (LSTM) models. Multiple datasets are combined into a larger dataset for experiments, and multiple feature sets are used to investigate the efficacy of the proposed approach. Features from the extra tree classifier, CNN, and LSTM are also considered for comparison. Experimental results show that CNN-LSTM-based features perform very well, with the random forest model obtaining an accuracy score of 0.99. This performance is further validated through comparison with existing approaches and through k-fold cross-validation, which indicate that the proposed approach provides robust results.
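The k-fold cross-validation step mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the feature vectors are synthetic placeholders rather than CNN-LSTM outputs, and a simple nearest-centroid rule stands in for the random forest so the example stays dependency-free. All function names (`k_fold_indices`, `cross_validate`, `nearest_centroid`, `make_toy_features`) are hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
# A fit_predict callable trains on (X_train, y_train) and returns labels for X_test.
FitPredict = Callable[[List[Vector], List[int], List[Vector]], List[int]]


def k_fold_indices(n_samples: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle the sample indices and split them into k nearly equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X: List[Vector], y: List[int], fit_predict: FitPredict,
                   k: int = 5, seed: int = 0) -> List[float]:
    """Return the accuracy obtained on each held-out fold."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        preds = fit_predict([X[i] for i in train_idx],
                            [y[i] for i in train_idx],
                            [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return scores


def nearest_centroid(X_train: List[Vector], y_train: List[int],
                     X_test: List[Vector]) -> List[int]:
    """Stand-in classifier (NOT the paper's random forest):
    predict the class whose mean training vector is closest."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {label: [s / counts[label] for s in acc]
                 for label, acc in sums.items()}

    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c]))
            for x in X_test]


def make_toy_features(n_per_class: int = 50,
                      seed: int = 0) -> Tuple[List[Vector], List[int]]:
    """Synthetic placeholder for extracted features: two Gaussian clusters."""
    rng = random.Random(seed)
    X: List[Vector] = []
    y: List[int] = []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0) for _ in range(4)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_features()
    scores = cross_validate(X, y, nearest_centroid, k=5)
    print("per-fold accuracy:", scores)
    print("mean accuracy:", sum(scores) / len(scores))
```

Because `cross_validate` only depends on the `fit_predict` callable, any other classifier, including a random forest trained on learned features, could be substituted for the stand-in without changing the evaluation loop. Reporting the per-fold scores alongside the mean gives a rough view of how stable the result is across splits.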