Enhanced detection of diabetes mellitus using novel ensemble feature engineering approach and machine learning model.

Affiliations

School of Systems and Technology, Department of Software Engineering, University of Management and Technology, Lahore, 54770, Pakistan.

Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Al Ahliyya Amman University, Amman, 19328, Jordan.

Publication information

Sci Rep. 2024 Oct 7;14(1):23274. doi: 10.1038/s41598-024-74357-w.

Abstract

Diabetes is a chronic health condition caused by insufficient or inappropriate use of insulin in the body. If left undetected, it can lead to further complications involving damage to organs such as the heart, lungs, and eyes. Timely detection of diabetes helps patients obtain the right medication, diet, and exercise plan to lead a healthy life. Machine learning (ML) approaches have been used for rapid and reliable diabetes detection; however, existing approaches suffer from limited datasets, lack of generalizability, and lower accuracy. This study proposes a novel feature extraction approach to overcome these limitations by using an ensemble of convolutional neural network (CNN) and long short-term memory (LSTM) models. Multiple datasets are combined to make a larger dataset for experiments, and multiple feature sets are used to investigate the efficacy of the proposed approach. Features from the extra tree classifier, CNN, and LSTM are also considered for comparison. Experimental results show that CNN-LSTM-based features with a random forest model perform best, obtaining an accuracy score of 0.99. This performance is further validated by comparison with existing approaches and by k-fold cross-validation, which indicates that the proposed approach provides robust results.
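
The abstract mentions k-fold cross-validation as a robustness check but does not describe the procedure. The following is a minimal, standard-library-only sketch of how such a check is commonly structured. It is not the authors' code: the nearest-centroid classifier is a simple stand-in for the random forest trained on CNN-LSTM features, the data are synthetic, and all function names (`k_fold_indices`, `fit_centroids`, `predict`, `cross_validate`) are hypothetical.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]

def fit_centroids(X, y):
    """Stand-in classifier (assumption, not the paper's model): per-class mean vectors."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, row):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda label: dist(centroids[label]))

def cross_validate(X, y, k=5, seed=0):
    """Return the accuracy measured on each of the k held-out folds."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except fold i, then evaluate on fold i.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit_centroids([X[j] for j in train_idx], [y[j] for j in train_idx])
        correct = sum(predict(model, X[j]) == y[j] for j in test_idx)
        scores.append(correct / len(test_idx))
    return scores

# Synthetic, well-separated two-class data (NOT the paper's dataset).
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)] + \
    [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(50)]
y = [0] * 50 + [1] * 50

scores = cross_validate(X, y, k=5)
mean_accuracy = sum(scores) / len(scores)
```

Each sample is held out exactly once, so the spread of `scores` across folds gives a rough indication of how stable a model's accuracy is under different train/test splits.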

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a61c/11458802/5bb7cca35c90/41598_2024_74357_Fig1_HTML.jpg
