Ministry of Education-Shanghai Key Laboratory of Children's Environmental Health, Department of Pediatrics, Xinhua Hospital, Early Life Health Institute, Shanghai Jiao-Tong University School of Medicine, Kong-Jiang Road, Shanghai, 200092, China.
Prosserman Centre for Population Health Research, Department of Obstetrics and Gynecology, Mount Sinai Hospital, Faculty of Medicine, Lunenfeld-Tanenbaum Research Institute, University of Toronto, L5-240, Murray Street 60, Toronto, ON, M5T 3H7, Canada.
BMC Pregnancy Childbirth. 2024 Sep 16;24(1):601. doi: 10.1186/s12884-024-06651-4.
It remains unclear which early gestational biomarkers can predict the later development of gestational diabetes mellitus (GDM). We sought to identify the optimal combination of early gestational biomarkers for predicting GDM using machine learning (ML) models.
This was a nested case-control study including 100 pairs of GDM and euglycemic (control) pregnancies in the Early Life Plan cohort in Shanghai, China. High-sensitivity C-reactive protein, sex hormone-binding globulin, insulin-like growth factor I, IGF binding protein 2 (IGFBP-2), total and high-molecular-weight adiponectin, and glycosylated fibronectin concentrations were measured in serum samples at 11-14 weeks of gestation. Routine first-trimester blood test biomarkers included fasting plasma glucose (FPG), serum lipids and thyroid hormones. Five ML models [stepwise logistic regression, least absolute shrinkage and selection operator (LASSO), random forest, support vector machine and k-nearest neighbor] were employed to predict GDM. The study subjects were randomly split into two sets for model development (training set, n = 70 GDM/control pairs) and validation (testing set, n = 30 GDM/control pairs). Model performance was evaluated by the area under the receiver operating characteristic (ROC) curve (AUC).
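The split and the evaluation metric described above can be sketched in a few lines of plain Python. This is an illustration only, not the authors' code: the function names `split_pairs` and `roc_auc`, the integer pair IDs, and the fixed seed are all hypothetical, and the sketch assumes the random split was made at the level of matched pairs, as the reported set sizes suggest.

```python
import random


def split_pairs(pair_ids, n_train, seed=0):
    """Randomly split matched-pair IDs so that both members of a
    GDM/control pair always land in the same set (assumed design)."""
    rng = random.Random(seed)
    shuffled = list(pair_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """AUC computed as the probability that a randomly chosen case
    (label 1) scores higher than a randomly chosen control (label 0).
    Ties count as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# 100 hypothetical pair IDs split 70/30, mirroring the reported set sizes.
train_pairs, test_pairs = split_pairs(range(100), n_train=70)

# Toy scores (not study data): three of the four case/control
# comparisons are ranked correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Splitting by pair ID rather than by individual keeps each matched case and control together, which avoids having one member of a pair in the training set and the other in the testing set.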
FPG and IGFBP-2 were consistently selected as predictors of GDM in all ML models. The random forest model including only FPG and IGFBP-2 performed best (AUC 0.80, accuracy 0.72, sensitivity 0.87, specificity 0.57). Adding more predictors did not improve the discriminant power.
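As a consistency check on the reported figures, the relationship between sensitivity, specificity and accuracy can be worked through in Python. The confusion-matrix counts below are not reported in the abstract. They are hypothetical values, chosen under the assumption that the testing set contained 30 GDM cases and 30 controls, that happen to reproduce the published metrics after rounding.

```python
def metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of GDM cases correctly flagged
    specificity = tn / (tn + fp)   # share of controls correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (assumption, not study data):
# 26 of 30 cases detected, 17 of 30 controls correctly classified.
sens, spec, acc = metrics(tp=26, fn=4, tn=17, fp=13)
print(round(sens, 2), round(spec, 2), round(acc, 2))  # prints: 0.87 0.57 0.72
```

With balanced groups, accuracy is the mean of sensitivity and specificity, so a high sensitivity paired with a specificity of 0.57 implies that roughly four in ten controls would be flagged as false positives.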
The combination of FPG and IGFBP-2 at early gestation (11-14 weeks) could predict later development of GDM with moderate discriminant power. Further validation studies are warranted to assess the utility of this simple combination model in other independent cohorts.