Prediction of gestational diabetes mellitus in Asian women using machine learning algorithms.

Author information

Department of Obstetrics and Gynecology, Seoul St. Mary's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.

Department of Obstetrics and Gynecology, St. Vincent's Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.

Publication information

Sci Rep. 2023 Aug 16;13(1):13356. doi: 10.1038/s41598-023-39680-8.

Abstract

This study developed a machine learning algorithm to predict gestational diabetes mellitus (GDM) using retrospective data from 34,387 pregnancies at multiple centers in South Korea. Variables were collected at baseline, E0 (up to 10 weeks' gestation), E1 (11-13 weeks' gestation), and M1 (14-24 weeks' gestation). The data set was randomly divided into training and test sets (7:3 ratio) to compare the performance of the light gradient boosting machine (LGBM) and extreme gradient boosting (XGBoost) algorithms using the full set of variables (original). A prediction model built on the whole cohort achieved area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPR) values of 0.711 and 0.246 at baseline, 0.720 and 0.256 at E0, 0.721 and 0.262 at E1, and 0.804 and 0.442 at M1, respectively. Three models with different variable sets were then compared: [a] variables from clinical guidelines; [b] variables selected by Shapley additive explanations (SHAP) values; and [c] variables selected by the Boruta algorithm. Simple questionnaires were developed from model [c], which used the fewest variables and performed similarly to or better than the other models. The combined use of maternal factors and laboratory data could effectively predict individual risk of GDM using a machine learning model.
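The evaluation setup described in the abstract (a random 7:3 split, scored by AUC and AUPR) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names are hypothetical, the toy labels and scores are made up for illustration, and AUPR is approximated here by step-wise average precision, which is one common estimator (the abstract does not say which one was used).

```python
import random


def split_train_test(rows, test_fraction=0.3, seed=0):
    """Shuffle rows and return (train, test); test_fraction=0.3 gives a 7:3 split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (assumed AUPR estimator).

    Sums precision * (increase in recall) over each distinct score threshold,
    moving from the highest score to the lowest.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    seen = 0
    prev_recall = 0.0
    area = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example that shares this score before updating the curve.
        while i < len(ranked) and ranked[i][0] == threshold:
            true_pos += ranked[i][1]
            seen += 1
            i += 1
        recall = true_pos / n_pos
        precision = true_pos / seen
        area += (recall - prev_recall) * precision
        prev_recall = recall
    return area


# Toy example (made-up data, 1 = GDM, 0 = no GDM).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))                      # 0.75
print(round(average_precision(toy_labels, toy_scores), 4))  # 0.8333

train, test = split_train_test(range(100))
print(len(train), len(test))                                # 70 30
```

When reading the reported values, note that the two metrics are not on the same scale: an uninformative classifier has an AUC of about 0.5, whereas its AUPR is roughly the prevalence of the positive class.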

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6da0/10432552/36ffcf9a97f1/41598_2023_39680_Fig1_HTML.jpg
