Ensemble of Bayesian alphabets via constraint weight optimization strategy improves genomic prediction accuracy.

Authors

Meher Prabina Kumar, Pradhan Upendra Kumar, Ray Mrinmoy, Gupta Ajit, Parsad Rajender, Gupta Pushpendra Kumar

Affiliations

Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India.

Division of Forecasting and Agricultural Systems Modeling, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India.

Publication information

G3 (Bethesda). 2025 Sep 3;15(9). doi: 10.1093/g3journal/jkaf150.

DOI:10.1093/g3journal/jkaf150

PMID:40728237

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12405891/

Abstract

This study proposes a weight optimization-based ensemble framework for improving genomic prediction accuracy. The framework combines 8 Bayesian models (BayesA, BayesB, BayesC, BayesBpi, BayesCpi, BayesR, BayesL, and BayesRR), with the weight assigned to each model optimized using a genetic algorithm. The ensemble model, named EnBayes, was evaluated on 18 datasets from 4 crop species and showed improved prediction accuracy compared with the individual Bayesian models. New objective functions were proposed to improve prediction accuracy in terms of both Pearson's correlation coefficient and mean square error. The accuracy of the ensemble model was found to be associated with the number of models considered in the framework: a few more accurate models achieved accuracy similar to that of a larger number of less accurate models. Additionally, over-biased and under-biased models influenced the bias of the ensemble model. The study also explored a meta-learning approach using the Bayesian models as base learners and random forest, quantile regression forest, and ridge regression as meta-learners; EnBayes outperformed this approach. When the traditional genomic prediction models GBLUP and rrBLUP and the machine learning models support vector machine, random forest, extreme gradient boosting, and light gradient boosting were included in the ensemble framework in addition to the Bayesian models, the ensemble model achieved higher accuracy than the individual Bayesian, BLUP, and machine learning models. We believe that EnBayes would contribute significantly to ongoing efforts to improve genomic prediction accuracy.
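
The constrained weight optimization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the objective functions, the constraints, or the genetic algorithm settings, so the fitness (Pearson correlation minus mean square error), the constraint set (non-negative weights summing to 1), the operators, and the simulated data below are all assumptions made for illustration. The names `objective`, `normalize`, and `ga_weights` are hypothetical.

```python
import numpy as np

def objective(weights, preds, y):
    """Assumed fitness: Pearson correlation minus MSE (higher is better)."""
    y_hat = preds @ weights
    r = np.corrcoef(y_hat, y)[0, 1]
    mse = np.mean((y_hat - y) ** 2)
    score = r - mse
    return score if np.isfinite(score) else -np.inf

def normalize(w):
    """Enforce the assumed constraint: weights >= 0 and summing to 1."""
    w = np.clip(w, 0.0, None)
    total = w.sum()
    return w / total if total > 0 else np.full_like(w, 1.0 / w.size)

def ga_weights(preds, y, pop_size=60, generations=100, mut_sd=0.1, seed=0):
    """Simple elitist genetic algorithm over ensemble weight vectors."""
    rng = np.random.default_rng(seed)
    n_models = preds.shape[1]
    # Initial population: random points on the simplex, plus equal weights.
    pop = rng.dirichlet(np.ones(n_models), size=pop_size)
    pop[0] = 1.0 / n_models
    n_elite = pop_size // 2
    for _ in range(generations):
        fitness = np.array([objective(w, preds, y) for w in pop])
        elite = pop[np.argsort(fitness)[::-1][:n_elite]]  # keep the best half
        # Crossover: blend two randomly chosen elite parents.
        a = rng.integers(0, n_elite, size=pop_size - n_elite)
        b = rng.integers(0, n_elite, size=pop_size - n_elite)
        alpha = rng.random((pop_size - n_elite, 1))
        children = alpha * elite[a] + (1.0 - alpha) * elite[b]
        # Mutation: Gaussian noise, then map back onto the constraint set.
        children += rng.normal(0.0, mut_sd, size=children.shape)
        children = np.array([normalize(c) for c in children])
        pop = np.vstack([elite, children])
    fitness = np.array([objective(w, preds, y) for w in pop])
    return pop[np.argmax(fitness)]

# Simulated example: three hypothetical base models with different noise levels.
rng = np.random.default_rng(1)
y = rng.normal(size=200)                      # stand-in for observed phenotypes
preds = np.column_stack([
    y + rng.normal(0.0, sd, size=200) for sd in (0.5, 0.8, 1.2)
])                                            # stand-in for base-model predictions
w = ga_weights(preds, y)
print("weights:", np.round(w, 3))
print("ensemble fitness:", objective(w, preds, y))
print("equal-weight fitness:", objective(np.full(3, 1 / 3), preds, y))
```

Because the best half of each generation is carried over unchanged and the equal-weight vector is in the starting population, the returned weights never score worse than a simple average on the data used for fitting. In practice the weights would presumably be fitted on held-out predictions (for example, from cross-validation) rather than on the data used to train the base models.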

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbac/12405891/0425b6e48f6d/jkaf150f1.jpg
