Comparison of machine learning and conventional logistic regression-based prediction models for gestational diabetes in an ethnically diverse population; the Monash GDM Machine learning model.

Author information

Belsti Yitayeh, Moran Lisa, Du Lan, Mousa Aya, De Silva Kushan, Enticott Joanne, Teede Helena

Affiliations

Monash Centre for Health Research and Implementation (MCHRI), Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Australia; University of Gondar, College of Medicine and Health Science, Ethiopia.

Monash Centre for Health Research and Implementation (MCHRI), Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Australia.

Publication information

Int J Med Inform. 2023 Nov;179:105228. doi: 10.1016/j.ijmedinf.2023.105228. Epub 2023 Sep 21.

DOI:10.1016/j.ijmedinf.2023.105228

PMID:37774429

Abstract

BACKGROUND

Early identification of pregnant women at high risk of developing gestational diabetes (GDM) is desirable as effective lifestyle interventions are available to prevent GDM and to reduce associated adverse outcomes. Personalised probability of developing GDM during pregnancy can be determined using a risk prediction model. These models extend from traditional statistics to machine learning methods; however, accuracy remains sub-optimal.

OBJECTIVE

We aimed to compare multiple machine learning algorithms to develop GDM risk prediction models, then to determine the optimal model for predicting GDM.

METHODS

A supervised machine learning predictive analysis was performed on data from routine antenatal care at a large health service network from January 2016 to June 2021. Predictor set 1 was sourced from the existing, internationally validated Monash GDM model: GDM history, body mass index, ethnicity, age, family history of diabetes, and past poor obstetric history. New models with different predictors were developed, considering statistical principles with inclusion of more robust continuous and derivative variables. A randomly selected 80% of the dataset was used for model development, with the remaining 20% used for validation. Performance measures, including calibration and discrimination metrics, were assessed. Decision curve analysis was performed.
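The random 80/20 split and the decision curve analysis mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names (`split_80_20`, `net_benefit`, `net_benefit_treat_all`), the seed, and the toy data are all assumptions made for the example. The net benefit formula is the standard one used in decision curve analysis, NB = TP/n - (FP/n) * pt / (1 - pt), where pt is the threshold probability.

```python
import random


def split_80_20(records, seed=0):
    """Randomly split records into 80% development and 20% validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging everyone whose predicted risk >= threshold.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: flag every woman as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, invented purely for illustration (not taken from the study).
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.05, 0.35, 0.5]

dev, val = split_80_20(range(10), seed=42)
nb_model = net_benefit(y_true, y_prob, threshold=0.32)
nb_all = net_benefit_treat_all(y_true, threshold=0.32)
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing it with the treat-all and treat-none (net benefit of zero) strategies; a model is useful at a given threshold when its net benefit exceeds both.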

RESULTS

Upon internal validation, the area under the curve (AUC) of the machine learning and logistic regression models ranged from 71% to 93% across the different algorithms, with the CatBoost Classifier (CBC) performing best. Based on the default cut-off point of 0.32, the performance of CBC on predictor set 4 was: Accuracy (85%), Precision (90%), Recall (78%), F1-score (84%), Sensitivity (81%), Specificity (90%), positive predictive value (92%), negative predictive value (78%), and Brier Score (0.39).
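For readers unfamiliar with the metrics listed above, the sketch below shows how they are conventionally computed from predicted probabilities and a cut-off. It uses only the standard binary definitions; the function names and toy data are assumptions for illustration and do not reproduce the study's evaluation code or results.

```python
def confusion_counts(y_true, y_prob, cutoff):
    """Count TP, FP, TN, FN when predicting positive if probability >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= cutoff else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def threshold_metrics(y_true, y_prob, cutoff):
    """Cut-off dependent metrics; assumes no denominator below is zero."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, cutoff)
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)           # also called precision
    npv = tn / (tn + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Chance that a random positive is scored above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration (not taken from the study).
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.05, 0.35, 0.5]

metrics = threshold_metrics(y_true, y_prob, cutoff=0.32)
brier = brier_score(y_true, y_prob)
area = auc(y_true, y_prob)
```

Under these standard binary definitions, recall is the same quantity as sensitivity and precision is the same quantity as positive predictive value. The abstract reports different values within each pair, so it may be using a convention (for example, a different averaging scheme) that is not stated there.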

CONCLUSIONS

In this study, machine learning approaches achieved better predictive performance than traditional statistical methods, with performance increasing from 75% to 93%. The CatBoost classifier performed best, using the model that included continuous variables.

Similar articles

Machine Learning-Derived Prenatal Predictive Risk Model to Guide Intervention and Prevent the Progression of Gestational Diabetes Mellitus to Type 2 Diabetes: Prediction Model Development Study.

JMIR Diabetes. 2022 Jul 5;7(3):e32366. doi: 10.2196/32366.

Comparison of Machine Learning Methods and Conventional Logistic Regressions for Predicting Gestational Diabetes Using Routine Clinical Data: A Retrospective Cohort Study.

J Diabetes Res. 2020 Jun 12;2020:4168340. doi: 10.1155/2020/4168340. eCollection 2020.

An early model to predict the risk of gestational diabetes mellitus in the absence of blood examination indexes: application in primary health care centres.

BMC Pregnancy Childbirth. 2021 Dec 8;21(1):814. doi: 10.1186/s12884-021-04295-2.

Early Prediction of Gestational Diabetes Mellitus in the Chinese Population via Advanced Machine Learning.

J Clin Endocrinol Metab. 2021 Mar 8;106(3):e1191-e1205. doi: 10.1210/clinem/dgaa899.

Machine learning risk score for prediction of gestational diabetes in early pregnancy in Tianjin, China.

Diabetes Metab Res Rev. 2021 Jul;37(5):e3397. doi: 10.1002/dmrr.3397. Epub 2020 Sep 9.

Machine learning approaches for prediction of early death among lung cancer patients with bone metastases using routine clinical characteristics: An analysis of 19,887 patients.

Front Public Health. 2022 Oct 6;10:1019168. doi: 10.3389/fpubh.2022.1019168. eCollection 2022.

Machine learning is an effective method to predict the 90-day prognosis of patients with transient ischemic attack and minor stroke.

BMC Med Res Methodol. 2022 Jul 16;22(1):195. doi: 10.1186/s12874-022-01672-z.

Development and validation of an early pregnancy risk score for the prediction of gestational diabetes mellitus in Chinese pregnant women.

BMJ Open Diabetes Res Care. 2020 Apr;8(1). doi: 10.1136/bmjdrc-2019-000909.

Development of machine learning models to predict gestational diabetes risk in the first half of pregnancy.

BMC Pregnancy Childbirth. 2023 Jun 23;23(1):469. doi: 10.1186/s12884-023-05766-4.

Cited by

Artificial Intelligence in Gestational Diabetes Care: A Systematic Review.

J Diabetes Sci Technol. 2025 Aug 25:19322968251355967. doi: 10.1177/19322968251355967.

Machine Learning Framework for Ovarian Cancer Diagnostics Using Plasma Lipidomics and Metabolomics.

Int J Mol Sci. 2025 Jul 10;26(14):6630. doi: 10.3390/ijms26146630.

Advanced Machine Learning did not Surpass Traditional Logistic Regression in First-Trimester Gestational Diabetes Mellitus Prediction: A Retrospective Single-Center Study From Eastern China.

Int J Gen Med. 2025 Apr 26;18:2263-2274. doi: 10.2147/IJGM.S513064. eCollection 2025.

Analyzing electronic medical records to extract prepregnancy morbidities and pregnancy complications: Toward a learning health system.

Learn Health Syst. 2024 Nov 26;9(2):e10473. doi: 10.1002/lrh2.10473. eCollection 2025 Apr.

Comparison of Logistic Regression and Machine Learning Approaches in Predicting Depressive Symptoms: A National-Based Study.

Psychiatry Investig. 2025 Mar;22(3):267-278. doi: 10.30773/pi.2024.0156. Epub 2025 Mar 18.

Machine learning based model for the early detection of Gestational Diabetes Mellitus.

BMC Med Inform Decis Mak. 2025 Mar 13;25(1):130. doi: 10.1186/s12911-025-02947-3.

Predicting low density lipoprotein cholesterol target attainment using machine learning in patients with coronary artery disease receiving moderate-dose statin therapy.

Sci Rep. 2025 Feb 13;15(1):5346. doi: 10.1038/s41598-025-88693-y.

Development and validation of a prediction model for coronary heart disease risk in depressed patients aged 20 years and older using machine learning algorithms.

Front Cardiovasc Med. 2025 Jan 9;11:1504957. doi: 10.3389/fcvm.2024.1504957. eCollection 2024.

Leveraging Shapley Additive Explanations for Feature Selection in Ensemble Models for Diabetes Prediction.

Bioengineering (Basel). 2024 Nov 30;11(12):1215. doi: 10.3390/bioengineering11121215.

Dietary inflammatory index as a predictor of prediabetes in women with previous gestational diabetes mellitus.

Diabetol Metab Syndr. 2024 Nov 6;16(1):265. doi: 10.1186/s13098-024-01486-7.
