Application of machine learning algorithm incorporating dietary intake in prediction of gestational diabetes mellitus.

Author information

Ding Tianze, Liu Peijie, Jia Jie, Wu Hui, Zhu Jie, Yang Kefeng

Affiliations

Department of Clinical Nutrition, Xin Hua Hospital Affiliated to School of Medicine, Shanghai Jiao Tong University, Shanghai, China.

Department of Clinical Nutrition, College of Health Science and Technology, School of Medicine, Shanghai Jiao Tong University, Shanghai, China.

Publication information

Endocr Connect. 2024 Nov 21;13(12). doi: 10.1530/EC-24-0169. Print 2024 Dec 1.

Abstract

INTRODUCTION

Gestational diabetes mellitus (GDM) significantly affects pregnancy outcomes. Therefore, it is crucial to develop prediction models since they can guide timely interventions to reduce the incidence of GDM and its associated adverse effects.

METHODS

A total of 554 pregnant women were selected, and their sociodemographic characteristics, clinical data, and dietary data were collected. Dietary intake was assessed with a validated semi-quantitative food frequency questionnaire (FFQ). Features were selected by random forest mean decrease in impurity, and models were built using logistic regression, XGBoost, and LightGBM. Prediction performance was compared using accuracy, sensitivity, specificity, area under the curve (AUC), and the Hosmer-Lemeshow test.
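To make the feature-selection criterion concrete, here is a minimal sketch of the quantity behind mean decrease in impurity: the drop in Gini impurity produced by one threshold split. A random forest averages such decreases over every split on a feature across all trees, weighted by the share of samples reaching each node; this sketch covers only a single root split. The function names, feature names, and toy values are illustrative assumptions, not data or code from the study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(values, labels, threshold):
    """Gini decrease from splitting samples into values <= threshold and > threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


def best_decrease(values, labels):
    """Largest decrease over midpoints between consecutive distinct feature values."""
    distinct = sorted(set(values))
    thresholds = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max((impurity_decrease(values, labels, t) for t in thresholds), default=0.0)


# Made-up toy data (NOT from the study): 1 = GDM, 0 = no GDM.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "fasting_glucose": [4.2, 4.4, 4.5, 4.7, 5.0, 5.2, 5.3, 5.6],
    "age": [25, 34, 27, 36, 29, 38, 31, 40],
}

# Rank the hypothetical features by their best single-split impurity decrease.
ranking = sorted(
    ((name, best_decrease(vals, labels)) for name, vals in toy_features.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the glucose-like feature separates the two classes perfectly, so it ranks above age. In practice the ranking would come from a fitted forest rather than a single split.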

RESULTS

Feature selection ranked blood glucose, age, pre-pregnancy body mass index (BMI), triglycerides, and high-density lipoprotein cholesterol (HDL) as the top five features. Among the three algorithms, XGBoost performed best (AUC = 0.788), followed by LightGBM (AUC = 0.749) and logistic regression (AUC = 0.712). XGBoost and LightGBM also performed better when dietary information was included than on the non-dietary dataset (AUC 0.788 vs 0.718 for XGBoost; 0.749 vs 0.726 for LightGBM).
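For readers less familiar with the metric being compared, the following sketch computes AUC as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, alongside sensitivity and specificity at a fixed cutoff. The labels, the two score lists, and the 0.5 cutoff are made-up assumptions for illustration only; they are not the study's predictions and do not reproduce its reported AUCs.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(labels, scores, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Made-up toy predictions (NOT from the study): 1 = GDM, 0 = no GDM.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores_without_diet = [0.80, 0.55, 0.40, 0.30, 0.60, 0.45, 0.35, 0.20, 0.15, 0.10]
scores_with_diet = [0.85, 0.65, 0.50, 0.30, 0.60, 0.40, 0.35, 0.20, 0.15, 0.10]

for name, scores in [("without diet", scores_without_diet), ("with diet", scores_with_diet)]:
    sens, spec = sensitivity_specificity(labels, scores)
    print(f"{name}: AUC={auc(labels, scores):.3f} sens={sens:.2f} spec={spec:.2f}")
```

The pairwise form is quadratic in sample size, which is fine for a few hundred cases; a rank-based implementation would be the usual choice for larger datasets.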

CONCLUSION

XGBoost and LightGBM algorithms outperform logistic regression in predicting GDM among Chinese pregnant women. In addition, dietary data may improve model performance, which deserves more in-depth investigation with a larger sample size.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5224/11623027/2fab7d3422e5/EC-24-0169fig1.jpg