Department of Oral and Maxillofacial Surgery, Korea University College of Medicine, Korea University Anam Hospital, Seoul 02841, Republic of Korea.
Department of Obstetrics and Gynecology, Korea University College of Medicine, Korea University Anam Hospital, Seoul 02841, Republic of Korea.
Int J Environ Res Public Health. 2023 Jan 18;20(3):1732. doi: 10.3390/ijerph20031732.
This study uses machine learning with large-scale population data to assess the associations of preterm birth (PTB) with dental and gastrointestinal diseases.
Population-based retrospective cohort data came from Korea National Health Insurance claims for 124,606 primiparous women aged 25-40 who delivered in 2017. The 186 independent variables included demographic/socioeconomic determinants, disease information, and medication history. Machine learning analysis was used to establish a prediction model for PTB. Random forest variable importance was used to identify the major determinants of PTB and to test the associations of PTB with dental and gastrointestinal diseases, medication history, and socioeconomic status.
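The abstract reports its results on "oversampling data" but does not say which oversampling method was used. As a minimal sketch only, assuming simple random oversampling of the minority class, the idea can be illustrated in plain Python. The function name `random_oversample` and the toy rows are hypothetical and are not taken from the study.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes match the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class that may be duplicated.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical toy data: label 1 = preterm birth, label 0 = term birth.
rows = [[30, 1], [28, 0], [35, 1], [26, 0], [33, 0], [39, 1]]
labels = [0, 0, 0, 0, 0, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)
```

In general practice, oversampling of this kind is applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.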
The random forest with oversampling data achieved an accuracy of 84.03% and areas under the receiver-operating-characteristic curve in the range of 84.03-84.04. Based on random forest variable importance with oversampling data, PTB has strong associations with socioeconomic status (0.284), age (0.214), year 2014 gastroesophageal reflux disease (GERD) (0.026), year 2015 GERD (0.026), year 2013 GERD (0.024), progesterone (0.024), year 2012 GERD (0.023), year 2011 GERD (0.021), tricyclic antidepressant (0.020) and year 2016 infertility (0.019). For example, the accuracy of the model will decrease by 28.4%, 2.6%, or 1.9% if the values of socioeconomic status, year 2014 GERD, or year 2016 infertility, respectively, are randomly permuted (shuffled).
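The interpretation above, importance as the drop in accuracy when a variable's values are shuffled, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for a fitted classifier (a fixed rule, not a random forest), and the data are invented.

```python
import random

def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one feature column at a time is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data: column 0 determines the outcome, column 1 is noise.
X = [[0, 5], [0, 3], [0, 8], [0, 1], [1, 4], [1, 7], [1, 2], [1, 6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

def toy_model(row):
    # Stand-in for a fitted classifier; it only looks at column 0.
    return row[0]

importances = permutation_importance(toy_model, X, y)
```

Shuffling the noise column leaves every prediction unchanged, so its importance is zero, while shuffling the informative column lowers accuracy. On this reading, a value such as 0.284 for socioeconomic status corresponds to a 28.4% drop in accuracy.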
By using machine learning, we established a valid prediction model for PTB. PTB has strong associations with GERD and infertility. Pregnant women need close surveillance for gastrointestinal and obstetric risks at the same time.