Department of Medical Statistics and Medical Informatics, Istanbul Medipol University, Faculty of Medicine, Istanbul, Turkey.
Department of Medical Statistics and Medical Informatics, Istanbul Medipol University, Faculty of Medicine, Istanbul, Turkey; Department of Evidence for Population Health Unit, School of Epidemiology and Health Sciences, The University of Manchester, Manchester, UK.
Reprod Biomed Online. 2022 Nov;45(5):923-934. doi: 10.1016/j.rbmo.2022.06.022. Epub 2022 Jun 28.
Which machine learning model predicts the implantation outcome better in an IVF cycle? What is the importance of each variable in predicting the implantation outcome in an IVF cycle?
Retrospective cohort study comprising 939 transferred embryos between 2014 and 2018 in an IVF centre in Turkey with 17 selected features. The algorithms were Logistic Regression (LR), Decision Tree (DT), Naïve Bayes (NB), Random Forest (RF), Support Vector Machine (SVM), Neural Network (Nnet), Gradient Boost Decision Tree (GBDT), eXtreme Gradient Boosting (XGBoost) and Super Learner (SL). The results were evaluated with performance metrics (F1 score, specificity, accuracy and area under the receiver operating characteristic curve [AUROC]) with 10-fold cross-validation repeated ten times.
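The evaluation protocol described above (10-fold cross-validation repeated ten times, scored by F1, specificity, accuracy and AUROC) can be sketched in plain Python. This is an illustrative sketch, not the authors' code: the function names `repeated_stratified_kfold`, `binary_metrics` and `auroc` are hypothetical, and stratifying the folds by outcome is an assumption that the abstract does not state.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Assumption: folds are stratified by class label; the abstract only
    says "10-fold cross-validation repeated ten times".
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    for rep in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for idxs in by_class.values():
            shuffled = idxs[:]
            rng.shuffle(shuffled)
            # Deal indices round-robin so each fold keeps the class ratio.
            for pos, i in enumerate(shuffled):
                folds[pos % n_splits].append(i)
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, f in enumerate(folds) if j != k for i in f)
            yield rep, k, train, test


def binary_metrics(y_true, y_pred):
    """F1, specificity and accuracy from hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"f1": f1, "specificity": specificity, "accuracy": accuracy}


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outranks a random
    negative (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In use, each candidate model would be fitted on `train_idx` and scored on `test_idx` for every one of the 100 splits, and the per-split metrics averaged to give one summary value per model.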
The RF and SL models achieved the highest performance, with F1 scores of 74% and 73%, respectively, specificity of 94%, accuracy of 89% and AUROC of 83%. In addition, the top features identified were maternal age, embryo transfer day, total gonadotrophin dose and oestradiol concentration.
The present study showed that machine learning algorithms can predict the implantation outcome of an IVF cycle. In addition, maternal age was by far the most important predictor of IVF success among the variables examined.