Machine Learning-Based Hyperglycemia Prediction: Enhancing Risk Assessment in a Cohort of Undiagnosed Individuals.

Authors

Oyebola Kolapo, Ligali Funmilayo, Owoloye Afolabi, Erinwusi Blessing, Alo Yetunde, Musa Adesola Z, Aina Oluwagbemiga, Salako Babatunde

Affiliations

Nigerian Institute of Medical Research, Lagos, Nigeria.

Centre for Genomic Research in Biomedicine, Mountain Top University, Ibafo, Nigeria.

Publication

JMIRx Med. 2024 Sep 11;5:e56993. doi: 10.2196/56993.

Abstract

BACKGROUND

Noncommunicable diseases continue to pose a substantial health challenge globally, with hyperglycemia serving as a prominent indicator of diabetes.

OBJECTIVE

This study used machine learning algorithms to predict hyperglycemia in a cohort of asymptomatic individuals and to identify key predictors that support early risk identification.

METHODS

The dataset comprised clinical and demographic data from 195 asymptomatic adults residing in a suburban community in Nigeria. Multiple machine learning algorithms were compared to determine the most effective model for predicting hyperglycemia. Feature importance was also examined to identify correlates of high blood glucose levels within the cohort.
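Comparing classifiers requires scoring each one on the same metrics. The sketch below shows how accuracy, F1-score, and area under the ROC curve can be computed from labels and predicted scores using only the Python standard library. It is an illustration, not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions, and the values are synthetic, not drawn from the cohort.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float], positive: int = 1) -> float:
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy example (hypothetical values): 1 = hyperglycemic, 0 = not.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

print(accuracy(toy_labels, toy_preds))
print(f1_score(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
```

Accuracy and F1 depend on the chosen threshold, whereas the ROC AUC depends only on how the scores rank the cases, which is one reason to report several metrics side by side.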

RESULTS

Elevated blood pressure and prehypertension were recorded in 8 (4.1%) and 18 (9.2%) of the 195 participants, respectively. A total of 41 (21%) participants presented with hypertension, of whom 34 (83%) were female. However, sex adjustment showed that 34 of 118 (28.8%) female participants and 7 of 77 (9%) male participants had hypertension. Age-based analysis revealed an inverse relationship between normotension and age (r=-0.88; P=.02). Conversely, hypertension increased with age (r=0.53; P=.27), peaking between 50 and 59 years. Of the 195 participants, isolated systolic hypertension and isolated diastolic hypertension were recorded in 16 (8.2%) and 15 (7.7%) participants, respectively, with female participants recording a higher prevalence of isolated systolic hypertension (11/16, 69%) and male participants recording a higher prevalence of isolated diastolic hypertension (11/15, 73%). Following class rebalancing, the random forest classifier gave the best performance of the 26 classifiers compared (accuracy 0.89; area under the receiver operating characteristic curve 0.89; F1-score 0.89). The feature selection model identified uric acid and age as important variables associated with hyperglycemia.
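The hypertension figures above use two different denominators: 83% is the female share of the 41 hypertensive participants, while 28.8% and 9% are prevalences within each sex. The short check below recomputes these percentages from the counts reported in the abstract. The helper name `pct` and the variable names are illustrative only.

```python
def pct(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as reported in the abstract.
participants = 195
hypertensive = 41
female_total, female_hypertensive = 118, 34
male_total, male_hypertensive = 77, 7

# The sex-specific counts should add up to the cohort totals.
assert female_total + male_total == participants
assert female_hypertensive + male_hypertensive == hypertensive

overall_prevalence = pct(hypertensive, participants)       # hypertension in the whole cohort
female_share = pct(female_hypertensive, hypertensive)      # share of hypertensive cases who are female
female_prevalence = pct(female_hypertensive, female_total) # prevalence among female participants
male_prevalence = pct(male_hypertensive, male_total)       # prevalence among male participants

print(overall_prevalence, female_share, female_prevalence, male_prevalence)
```

Because women made up 118 of the 195 participants, the female share of cases (about 83%) overstates the gap in risk; the within-sex prevalences (about 29% versus 9%) are the comparable figures.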

CONCLUSIONS

The random forest classifier identified significant clinical correlates associated with hyperglycemia, offering valuable insights for the early detection of diabetes and informing the design and deployment of therapeutic interventions. However, to achieve a more comprehensive understanding of each feature's contribution to blood glucose levels, modeling additional relevant clinical features in larger datasets could be beneficial.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f89/11441453/dc2a1e25f193/xmed-v5-e56993-g001.jpg
