Using a robust model to detect the association between anthropometric factors and T2DM: machine learning approaches.

Authors

Hosseini Nafiseh, Tanzadehpanah Hamid, Mansoori Amin, Sabzekar Mostafa, Ferns Gordon A, Esmaily Habibollah, Ghayour-Mobarhan Majid

Affiliations

International UNESCO Center for Health-Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences, Mashhad, 99199-91766, Iran.

Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.

Publication information

BMC Med Inform Decis Mak. 2025 Jan 31;25(1):49. doi: 10.1186/s12911-025-02887-y.

Abstract

BACKGROUND

The aim of this study was to evaluate candidate models for identifying the anthropometric factors most strongly associated with type 2 diabetes mellitus (T2DM).

METHOD

The dataset was derived from the Mashhad Stroke and Heart Atherosclerotic Disorders (MASHAD) study and comprised 9354 subjects aged 35 to 65 years, of whom 25% (2336 people) were diabetic and 75% (7018 people) were non-diabetic. Ten anthropometric factors and age, measured in all subjects, were analyzed. A K-nearest neighbor (KNN) model was used to assess the association between T2DM and the selected factors. The model was evaluated using accuracy, sensitivity, specificity, precision and F1-measure. The receiver operating characteristic (ROC) curve and factor importance were also determined. The performance of the KNN model was compared with that of artificial neural network (ANN) and support vector machine (SVM) models.
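The five evaluation measures named above all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: diabetic subjects classified correctly/incorrectly.
    tn/fp: non-diabetic subjects classified correctly/incorrectly.
    """
    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall, true-positive rate
    specificity = ratio(tn, tn + fp)   # true-negative rate
    precision = ratio(tp, tp + fp)     # positive predictive value
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, for illustration only (not the study's results).
example = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With a 25%/75% class split such as the one described here, accuracy alone can look high even when few diabetic subjects are detected, which is one reason to read sensitivity and precision alongside it.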

RESULT

After feature selection analysis and assessment of multicollinearity, six factors (Mid-arm Circumference (MAC), Waist Circumference (WC), Body Roundness Index (BRI), Body Adiposity Index (BAI), Body Mass Index (BMI) and age) were used in the final model. BRI, BAI and MAC in males, and BMI, BRI and MAC in females, were found to have the greatest association with T2DM. The accuracy of the KNN model was approximately 93% for both genders. The best K (number of neighbors) for the model was 4, which gave the lowest error rate. The area under the ROC curve (AUC) was 0.985 for men and 0.986 for women. Of the models explored, the KNN model achieved the best results.
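Choosing K by lowest error rate can be illustrated with a small, dependency-free sketch. It assumes Euclidean distance, majority voting and a single held-out validation split; the abstract does not state the distance metric or validation scheme, so these are assumptions. The data are synthetic and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to x and keep the k closest.
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, the label seen first among the nearest neighbors wins.
    return votes.most_common(1)[0][0]


def error_rate(train_X, train_y, val_X, val_y, k):
    """Fraction of validation points misclassified for a given k."""
    wrong = sum(knn_predict(train_X, train_y, x, k) != y for x, y in zip(val_X, val_y))
    return wrong / len(val_y)


def select_k(train_X, train_y, val_X, val_y, candidates):
    """Return the candidate k with the lowest validation error, plus all error rates."""
    errors = {k: error_rate(train_X, train_y, val_X, val_y, k) for k in candidates}
    best_k = min(errors, key=errors.get)  # earliest candidate wins on ties
    return best_k, errors


def make_synthetic(n, rng):
    """Synthetic two-feature data (assumption): label 1 is shifted relative to label 0."""
    X, y = [], []
    for _ in range(n):
        label = int(rng.random() < 0.25)  # roughly the 25% prevalence in the abstract
        shift = 1.5 if label else 0.0
        X.append((rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)))
        y.append(label)
    return X, y


rng = random.Random(0)
train_X, train_y = make_synthetic(200, rng)
val_X, val_y = make_synthetic(100, rng)
best_k, errors = select_k(train_X, train_y, val_X, val_y, range(1, 11))
print("best k:", best_k, "validation error:", errors[best_k])
```

An even K such as 4 can produce tied votes in a two-class problem, so the tie-breaking rule matters; the one used here is just one possible choice. Features measured on different scales would normally be standardized before distances are computed.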

CONCLUSION

The KNN model had a high accuracy (93%) for predicting the association between anthropometric factors and T2DM. The choice of the K parameter (the number of nearest neighbors) has an essential impact on reducing the error rate. Feature selection analysis reduces the dimensionality of the KNN model and increases the accuracy of the final results.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/680c/11786328/7d651c0418f9/12911_2025_2887_Fig1_HTML.jpg
