ExAutoGP: Enhancing Genomic Prediction Stability and Interpretability with Automated Machine Learning and SHAP.

Authors

Rao Yao, Zhang Lilian, Gao Lutao, Wang Shuran, Yang Linnan

Affiliations

College of Big Data, Yunnan Agricultural University, Kunming 650201, China.

Yunnan Engineering Technology Research Center of Agricultural Big Data, Kunming 650201, China.

Publication information

Animals (Basel). 2025 Apr 18;15(8):1172. doi: 10.3390/ani15081172.

DOI:10.3390/ani15081172
PMID:40282006
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12024354/
Abstract

Machine learning has attracted much attention in genomic prediction because of its strong predictive capabilities, yet the limited interpretability of its modeling decisions remains a major challenge. In this study, we propose a novel machine learning method, ExAutoGP, which aims to improve the accuracy of genomic prediction and the transparency of the model by combining automated machine learning (AutoML) with SHapley Additive exPlanations (SHAP). To evaluate ExAutoGP's effectiveness, we designed a comparative experiment using one simulated dataset and two real animal datasets. For each dataset, we applied ExAutoGP and five baseline models: Genomic Best Linear Unbiased Prediction (GBLUP), BayesB, Support Vector Regression (SVR), Kernel Ridge Regression (KRR), and Random Forest (RF). All models were trained and evaluated using five repeats of five-fold cross-validation, and their performance was assessed on both predictive accuracy and computational efficiency. The results show that ExAutoGP delivers robust, strong predictive performance on all datasets. In addition, the SHAP method not only reveals the decision-making process of ExAutoGP and enhances its interpretability, but also identifies genetic markers closely related to the traits. This study demonstrates the strong potential of AutoML in genomic prediction, while the introduction of SHAP provides actionable biological insights. The combination of high prediction accuracy and interpretability offers new perspectives for optimizing genomic selection strategies in livestock and poultry breeding.
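The evaluation protocol described in the abstract (five repeats of five-fold cross-validation, followed by Shapley-style attribution of predictions to markers) can be illustrated with a minimal sketch. This is not the authors' ExAutoGP pipeline: it assumes NumPy only, uses a plain ridge regression on markers as a stand-in model, takes Pearson correlation between predicted and observed phenotypes as the accuracy measure, and uses the closed-form Shapley attribution for a linear model (effect times the marker's deviation from its mean, which assumes features are treated as independent). The simulated data, the function names, and the parameters `alpha`, `n_qtl`, and `h2` are all hypothetical choices made for illustration.

```python
import numpy as np


def repeated_kfold_indices(n_samples, n_splits=5, n_repeats=5, seed=0):
    """Yield (repeat, train_idx, test_idx); each repeat reshuffles the samples."""
    rng = np.random.default_rng(seed)
    for repeat in range(n_repeats):
        folds = np.array_split(rng.permutation(n_samples), n_splits)
        for f in range(n_splits):
            train = np.concatenate([folds[j] for j in range(n_splits) if j != f])
            yield repeat, train, folds[f]


def ridge_fit(X, y, alpha):
    """Ridge regression on centered markers, solved in the n x n dual form
    (cheaper than the p x p form when markers outnumber animals)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    dual = np.linalg.solve(Xc @ Xc.T + alpha * np.eye(len(y)), y - y_mean)
    return Xc.T @ dual, x_mean, y_mean


def ridge_predict(X, beta, x_mean, y_mean):
    return (X - x_mean) @ beta + y_mean


def linear_shap(X, beta, x_mean):
    """Shapley attributions for a linear model with independent features:
    phi[i, j] = beta[j] * (X[i, j] - mean_j)."""
    return (X - x_mean) * beta


def simulate(n=200, p=500, n_qtl=20, h2=0.5, seed=1):
    """Toy genotypes coded 0/1/2 with a few causal markers (hypothetical data)."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n, p)).astype(float)
    effects = np.zeros(p)
    effects[rng.choice(p, size=n_qtl, replace=False)] = rng.normal(size=n_qtl)
    g = X @ effects
    noise_sd = np.sqrt(g.var() * (1 - h2) / h2)
    return X, g + rng.normal(scale=noise_sd, size=n)


X, y = simulate()

# Five repeats of five-fold CV: 25 accuracy estimates in total.
accs = []
for _, train, test in repeated_kfold_indices(len(y)):
    beta, x_mean, y_mean = ridge_fit(X[train], y[train], alpha=100.0)
    pred = ridge_predict(X[test], beta, x_mean, y_mean)
    accs.append(np.corrcoef(pred, y[test])[0, 1])
accs = np.array(accs)
print(f"folds: {len(accs)}, mean r: {accs.mean():.3f}, sd: {accs.std(ddof=1):.3f}")

# Attribute predictions of a model fitted on all animals to individual markers.
beta, x_mean, y_mean = ridge_fit(X, y, alpha=100.0)
phi = linear_shap(X, beta, x_mean)
importance = np.abs(phi).mean(axis=0)      # mean |SHAP| per marker
top_markers = np.argsort(importance)[::-1][:10]
print("top markers by mean |SHAP|:", top_markers)
```

Reporting the mean and standard deviation over all 25 folds gives a rough view of stability as well as accuracy. For non-linear learners such as RF or an AutoML ensemble, the closed form above no longer applies and a dedicated SHAP implementation would be needed.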

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96cf/12024354/546c63ea541b/animals-15-01172-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96cf/12024354/5f45853df625/animals-15-01172-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96cf/12024354/48cd07a410d1/animals-15-01172-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96cf/12024354/b3b047bacd16/animals-15-01172-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96cf/12024354/0cff17cba4cf/animals-15-01172-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/96cf/12024354/37dbc3aa3c51/animals-15-01172-g006.jpg

Cited by

1
Advancing genome-based precision medicine: a review on machine learning applications for rare genetic disorders.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf329.