Liu Yu, Wang Yi, Huang Kai, Shi Hao, Xin Hang, Dai Shanjun, Liu Jinhao, Yang Xinhong, Song Jianyuan, Zhang Fuli, Guo Yihong
Reproductive Medicine Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China.
Front Endocrinol (Lausanne). 2025 Jun 12;16:1556681. doi: 10.3389/fendo.2025.1556681. eCollection 2025.
Objective: To evaluate the predictive performance of a convolutional neural network (CNN) for analyzing electronic medical records (EMR) in assisted reproductive therapy, and to compare its accuracy and interpretability with those of traditional machine learning models. The study also explores the feasibility of deploying such models in resource-limited clinical settings.
Design: Retrospective cohort study based on EMR data, comparing five models: CNN, Naïve Bayes, Random Forest, Decision Tree, and Feedforward Neural Network. Feature importance and model interpretability were evaluated using SHAP (SHapley Additive exPlanations).
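SHAP attributions are built on Shapley values, so a brute-force computation on a toy model may help show what "feature importance" means here. The sketch below is not the study's pipeline and does not use the `shap` package. The linear `toy_score`, its weights, the `patient` and `baseline` vectors, and the feature order are all invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values; features outside a coalition take the baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical linear score over [age, BMI, antral follicle count, gonadotropin dose].
# The weights are made up for this example and are NOT taken from the study.
WEIGHTS = [-0.08, -0.03, 0.05, -0.0002]

def toy_score(z):
    return sum(w * v for w, v in zip(WEIGHTS, z))

patient = [38.0, 27.0, 6.0, 3000.0]    # hypothetical patient
baseline = [31.0, 22.0, 12.0, 2000.0]  # hypothetical reference values

phi = shapley_values(toy_score, patient, baseline)
for name, contribution in zip(["age", "BMI", "AFC", "Gn dose"], phi):
    print(f"{name:8s} {contribution:+.4f}")
```

For a linear model each attribution reduces to weight × (value − baseline), and the attributions sum to the difference between the patient's score and the baseline score. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations.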
Setting: The First Affiliated Hospital of Zhengzhou University.
Patients: 48,514 fresh in vitro fertilization (IVF) cycles from August 2009 to May 2018.
Interventions: Preprocessed EMR data were used to train and evaluate five classification models predicting live birth outcomes. Stratified 5-fold cross-validation was used to obtain robust performance estimates. Models were compared using receiver operating characteristic (ROC) curves and area under the curve (AUC) values.
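The evaluation protocol described above (stratified 5-fold cross-validation scored by AUC) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic, the nearest-centroid scorer is a placeholder for the five models in the study, and all function names are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean, stdev

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose class ratios mirror the full data."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def centroid_scorer(x_train, y_train):
    """Placeholder model: score rises as x moves toward the positive-class mean."""
    mu1 = mean(x for x, y in zip(x_train, y_train) if y == 1)
    mu0 = mean(x for x, y in zip(x_train, y_train) if y == 0)
    return lambda x: abs(x - mu0) - abs(x - mu1)

def cross_validated_auc(xs, ys, k=5, seed=0):
    aucs = []
    for train, test in stratified_kfold(ys, k, seed):
        score = centroid_scorer([xs[i] for i in train], [ys[i] for i in train])
        aucs.append(roc_auc([ys[i] for i in test], [score(xs[i]) for i in test]))
    return aucs

# Synthetic single-feature data: 60 positives, 140 negatives (not study data).
rng = random.Random(42)
ys = [1] * 60 + [0] * 140
xs = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in ys]

aucs = cross_validated_auc(xs, ys)
print(f"AUC: {mean(aucs):.4f} ± {stdev(aucs):.4f}")
```

Reporting the mean and standard deviation across folds mirrors the "value ± spread" format used in the results. Stratification keeps the live-birth rate in each test fold close to that of the full cohort.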
Main outcome measure: Live birth.
Results: The CNN model achieved an accuracy of 0.9394 ± 0.0013, AUC of 0.8899 ± 0.0032, precision of 0.9348 ± 0.0018, recall of 0.9993 ± 0.0012, and F1 score of 0.9660 ± 0.0007. Its accuracy was comparable to that of Random Forest (accuracy: 0.9406 ± 0.0017), although Random Forest attained a higher AUC (0.9734 ± 0.0012). The CNN was superior to Decision Tree, Naïve Bayes, and Feedforward Neural Network in recall and robustness. The CNN converged stably during training, and SHAP-based interpretation highlighted maternal age, body mass index (BMI), antral follicle count, and gonadotropin dosage as the top predictors of live birth.
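As a quick consistency check on the reported CNN metrics, F1 is the harmonic mean of precision and recall. The short sketch below plugs in the reported fold-averaged values. Because an average of per-fold F1 scores need not equal the F1 of averaged precision and recall, only approximate agreement is expected. The helper name `f1_from` is hypothetical.

```python
def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Fold-averaged CNN values as reported in the abstract.
reported = {"precision": 0.9348, "recall": 0.9993, "f1": 0.9660}

f1 = f1_from(reported["precision"], reported["recall"])
print(f"recomputed F1 = {f1:.4f}, reported F1 = {reported['f1']:.4f}")
```

The recomputed value agrees with the reported F1 to within rounding.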
Conclusions: With appropriate input transformation, CNNs can effectively model structured EMR data and offer predictive performance comparable to ensemble methods. Their scalability, high sensitivity, and interpretability make CNNs promising candidates for integration into clinical workflows, particularly in environments with limited computational resources.