Comparison of Machine Learning Methods and Conventional Logistic Regressions for Predicting Gestational Diabetes Using Routine Clinical Data: A Retrospective Cohort Study.

Affiliations

Obstetrics and Gynecology Hospital, Fudan University, Shanghai, China.

The Shanghai Key Laboratory of Female Reproductive Endocrine-Related Diseases, Shanghai, China.

Publication information

J Diabetes Res. 2020 Jun 12;2020:4168340. doi: 10.1155/2020/4168340. eCollection 2020.

DOI:10.1155/2020/4168340

PMID:32626780

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7306091/

Abstract

BACKGROUND

Gestational diabetes mellitus (GDM) contributes to adverse pregnancy and birth outcomes. In recent decades, extensive research has been devoted to the early prediction of GDM by various methods. Machine learning methods are flexible prediction algorithms with potential advantages over conventional regression.

OBJECTIVE

The purpose of this study was to use machine learning methods to predict GDM and compare their performance with that of logistic regressions.

METHODS

We performed a retrospective observational study of women who attended their routine first hospital visit during early pregnancy and underwent Down's syndrome screening at 16-20 gestational weeks at a tertiary maternity hospital in China between January 1, 2013 and December 31, 2017. A total of 22,242 singleton pregnancies were included, and 3,182 (14.31%) women developed GDM. Candidate predictors included maternal demographic characteristics and medical history (maternal factors) and laboratory values in early pregnancy. The models were derived from the first 70% of the data and then validated on the remaining 30%. The candidate variables were used to train several machine learning models and traditional logistic regression models. Eight common machine learning methods (GBDT, AdaBoost, LGB, Logistic, Vote, XGB, Decision Tree, and Random Forest) and two common regressions (stepwise logistic regression and logistic regression with restricted cubic splines [RCS]) were implemented to predict the occurrence of GDM. Models were compared on discrimination and calibration metrics.
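As an illustration of the derivation/validation design described above, the following is a minimal sketch, not the authors' code. It assumes the records are already sorted by visit date, uses a synthetic cohort, and computes discrimination (AUC) with the rank-sum identity. The names `chronological_split` and `auc`, the synthetic glucose values, and the two stand-in "models" are all hypothetical.

```python
import random


def chronological_split(records, train_percent=70):
    """Split records that are already sorted by visit date.

    The earliest `train_percent` percent form the derivation set and the
    rest form the validation set. Integer arithmetic avoids float rounding.
    """
    cut = len(records) * train_percent // 100
    return records[:cut], records[cut:]


def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Tied scores receive their average rank, so ties count as one half.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Synthetic, illustrative cohort only: (visit_order, fasting_glucose, gdm_label).
rng = random.Random(0)
cohort = []
for visit_order in range(1000):
    has_gdm = 1 if rng.random() < 0.14 else 0
    glucose = rng.gauss(4.6 + 0.4 * has_gdm, 0.4)
    cohort.append((visit_order, glucose, has_gdm))

derivation, validation = chronological_split(cohort)
val_labels = [row[2] for row in validation]

# Two stand-in "models": one scores by glucose, one scores at random.
glucose_scores = [row[1] for row in validation]
random_scores = [rng.random() for _ in validation]

print("derivation size:", len(derivation), "validation size:", len(validation))
print("AUC, glucose score:", round(auc(val_labels, glucose_scores), 3))
print("AUC, random score:", round(auc(val_labels, random_scores), 3))
```

Splitting by time rather than at random means the validation set contains only later pregnancies, which gives a more conservative check of how a model would behave on future patients.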

RESULTS

In the validation dataset, the machine learning and logistic regression models showed moderate discrimination (AUC 0.59-0.74). Overall, the GBDT model performed best among the machine learning methods (AUC 0.74, 95% CI 0.71-0.76), although the differences between methods were negligible. Fasting blood glucose, HbA1c, triglycerides, and BMI were strong contributors to the prediction of GDM. In the GBDT model, a cutoff point of 0.3 for the predicted value had a negative predictive value of 74.1% (95% CI 69.5%-78.2%) and a sensitivity of 90% (95% CI 88.0%-91.7%), and a cutoff point of 0.7 had a positive predictive value of 93.2% (95% CI 88.2%-96.1%) and a specificity of 99% (95% CI 98.2%-99.4%).
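The cutoff-based metrics reported above can be sketched as follows. This is a minimal illustration on made-up predictions, not the study data. It assumes that a predicted value at or above the cutoff counts as a positive prediction, which the abstract does not state. The helper names `cutoff_metrics` and `stratify` are hypothetical.

```python
def cutoff_metrics(labels, probs, cutoff):
    """Sensitivity, specificity, PPV, and NPV when prob >= cutoff is 'positive'."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted_positive = p >= cutoff  # assumption: cutoff is inclusive
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def stratify(prob, low=0.3, high=0.7):
    """Map a predicted value to a risk band using two cutoffs."""
    if prob < low:
        return "low"
    if prob >= high:
        return "high"
    return "intermediate"


# Made-up labels and predicted values, for illustration only.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
probs = [0.1, 0.2, 0.35, 0.25, 0.6, 0.5, 0.8, 0.9]

for cutoff in (0.3, 0.7):
    print(cutoff, cutoff_metrics(labels, probs, cutoff))
print([stratify(p) for p in probs])
```

A low cutoff favors sensitivity and is read through the negative predictive value, while a high cutoff favors specificity and is read through the positive predictive value, which is why the two thresholds are reported with different metrics.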

CONCLUSION

In this study, we found that several machine learning methods did not outperform logistic regression in predicting GDM. We developed a model with cutoff points for risk stratification of GDM.
