Instituto Bernabéu Alicante, Avda. Albufereta, 31, 03016, Alicante, Spain.
Instituto Bernabéu Madrid, Madrid, Spain.
Reprod Biol Endocrinol. 2024 Sep 11;22(1):116. doi: 10.1186/s12958-024-01285-9.
Data science and artificial intelligence are emerging as promising tools in assisted reproduction, aided by time-lapse incubator technology. Our objective was to analyze and compare machine learning algorithms developed on a known-implantation database of embryos transferred in our egg donation program, including morphokinetic and morphological variables, in order to identify the most predictive algorithm and the most predictive embryo parameters and thereby improve the clinical outcomes of IVF treatment.
Multicenter retrospective cohort study of 378 egg donor recipients who underwent a fresh single embryo transfer during 2021. All treatments were performed by intracytoplasmic sperm injection, using fresh or frozen oocytes. The embryos were cultured in Geri® time-lapse incubators until transfer on day 5. The morphokinetic events of 378 blastocysts with known implantation and live birth outcomes were analyzed. Classical statistical analysis (binary logistic regression) and 10 machine learning algorithms were applied: Multi-Layer Perceptron, Support Vector Machines, k-Nearest Neighbors, CART and C5.0 classification trees, Random Forest (RF), AdaBoost classification trees, Stochastic Gradient Boosting, Bagged CART and eXtreme Gradient Boosting. These algorithms were developed and optimized by maximizing the area under the curve.
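To illustrate what selecting a model by area under the curve involves, the following is a minimal sketch, not the study's pipeline. It computes the AUC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, then picks the candidate with the highest value. The labels, the model names `model_a` and `model_b`, and their scores are invented for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def best_model_by_auc(
    labels: Sequence[int], model_scores: Dict[str, Sequence[float]]
) -> Tuple[str, Dict[str, float]]:
    """Return the name of the highest-AUC model and the AUC of every model."""
    aucs = {name: roc_auc(labels, s) for name, s in model_scores.items()}
    best = max(aucs, key=lambda name: aucs[name])
    return best, aucs


# Hypothetical data: 1 = implantation, 0 = no implantation.
labels: List[int] = [1, 0, 1, 1, 0, 0, 1, 0]
model_scores: Dict[str, Sequence[float]] = {
    "model_a": [0.9, 0.2, 0.8, 0.4, 0.5, 0.1, 0.7, 0.3],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.3, 0.6, 0.5, 0.2],
}

best, aucs = best_model_by_auc(labels, model_scores)
print(best, aucs)
```

In practice the scores would come from held-out or cross-validated predictions, so that the comparison does not reward overfitting.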
Random Forest emerged as the most predictive algorithm for implantation (area under the curve, AUC = 0.725, 95% CI [0.6232-0.826]). Overall implantation and miscarriage rates were 56.08% and 18.39%, respectively, and the overall live birth rate was 41.26%. Significant differences were observed in time to hatching out of the zona pellucida (p = 0.039). The Random Forest algorithm also showed good predictive capability for live birth (AUC = 0.689, 95% CI [0.5821-0.7921]), but AdaBoost classification trees proved to be the most predictive model for live birth (AUC = 0.749, 95% CI [0.6522-0.8452]). Other variables with substantial predictive weight for implantation and live birth were duration of visible pronuclei (DESAPPN-APPN), synchronization of cleavage patterns (T8-T5), duration of compaction (TM-TiCOM), time from compaction to the first sign of cavitation (TiCAV-TM) and time to early compaction (TiCOM).
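The interval variables named above are differences between annotated time-lapse events. The sketch below shows one way such features could be derived. It assumes each event is recorded as a time in hours, and that the later event is the first term of each name (for example, DESAPPN is pronuclear fading and APPN is pronuclear appearance). The timing values are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple

# Interval name -> (later event, earlier event), following the variable names
# in the abstract. Reading each name as "later minus earlier" is an assumption.
INTERVALS: Dict[str, Tuple[str, str]] = {
    "DESAPPN-APPN": ("DESAPPN", "APPN"),  # duration of visible pronuclei
    "T8-T5": ("T8", "T5"),                # synchronization of cleavage patterns
    "TM-TiCOM": ("TM", "TiCOM"),          # duration of compaction
    "TiCAV-TM": ("TiCAV", "TM"),          # compaction to first sign of cavitation
}


def derive_intervals(timings: Dict[str, float]) -> Dict[str, float]:
    """Compute interval features from event times given in hours.

    Intervals with a missing event are skipped; a negative duration is
    treated as an annotation error.
    """
    derived: Dict[str, float] = {}
    for name, (later, earlier) in INTERVALS.items():
        if later not in timings or earlier not in timings:
            continue
        duration = timings[later] - timings[earlier]
        if duration < 0:
            raise ValueError(f"{name}: {later} is annotated before {earlier}")
        derived[name] = duration
    return derived


# Hypothetical annotation for a single embryo (hours).
embryo = {
    "APPN": 8.0, "DESAPPN": 24.0, "T5": 50.0, "T8": 58.5,
    "TiCOM": 80.0, "TM": 92.0, "TiCAV": 98.5,
}
features = derive_intervals(embryo)
print(features)
```

TiCOM is used directly as a time point rather than as an interval, so it would be passed to a model unchanged alongside these derived features.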
This study highlights Random Forest and AdaBoost as the most effective machine learning models on the known implantation and live birth database from our egg donation program. Notably, time to blastocyst hatching out of the zona pellucida emerged as a highly reliable parameter that significantly influenced our machine learning models for predicting implantation. Processes involving syngamy, genomic imprinting during embryo cleavage, and embryo compaction are also influential and could be crucial for implantation and live birth outcomes.