Integration of machine learning and genome-wide association study to explore the genomic prediction accuracy of agronomic trait in oats (Avena sativa L.).

Author information

Peng Jinghan, Lei Xiong, Liu Tianqi, Xiong Yi, Wu Jiqiang, Xiong Yanli, You Minghong, Zhao Junming, Zhang Jian, Ma Xiao

Affiliations

College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, China.

Sichuan Academy of Grassland Science, Chengdu, China.

Publication information

Plant Genome. 2025 Mar;18(1):e20549. doi: 10.1002/tpg2.20549.

Abstract

Machine learning (ML) has garnered significant attention for its potential to enhance the accuracy of genomic prediction (GP) in various economic crops using complete genomic information. Genome-wide association studies (GWAS) are widely used to pinpoint trait-related causal variant loci in genomes. However, integrating the two methods for crop genomic prediction requires further research. In this study, we integrated ML and GWAS to assess the efficiency of GP for seven key agronomic traits in 195 oat (Avena sativa) cultivars from major oat-growing regions around the world. A total of 94 trait-associated single nucleotide polymorphisms were identified by GWAS. GP was conducted using the classical genomic best linear unbiased prediction (GBLUP) model and six ML models. GBLUP performed poorly in predicting all traits except flag leaf width, while none of the ML models consistently provided the best prediction accuracy across all traits. Prediction accuracy with GWAS-derived markers was higher than with genome-wide markers; plant height reached its highest prediction accuracy with 100 GWAS-derived markers, whereas the remaining traits required more markers. These results advance the use of GP in small oat breeding programs by optimizing prediction accuracy while reducing the number of markers, and they confirm that high prediction accuracy can be achieved with smaller datasets.
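The comparison the abstract describes (GBLUP on genome-wide markers versus GBLUP on a smaller, association-ranked marker subset) can be illustrated with a minimal sketch. Everything below is an assumption for illustration and not the authors' pipeline: the genotypes and phenotypes are simulated, the heritability `h2` is fixed rather than estimated, and `top_markers` ranks markers by a simple single-marker correlation as a stand-in for a real GWAS. The function names `vanraden_g`, `gblup_predict`, `top_markers`, and `accuracy` are hypothetical.

```python
import numpy as np


def vanraden_g(M):
    """Genomic relationship matrix from a 0/1/2 genotype matrix (VanRaden method 1)."""
    p = M.mean(axis=0) / 2.0                 # allele frequency per marker
    Z = M - 2.0 * p                          # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup_predict(M, y, train, test, h2=0.5):
    """GBLUP prediction for test lines; h2 is an assumed, fixed heritability."""
    G = vanraden_g(M)
    lam = (1.0 - h2) / h2                    # ratio of residual to genetic variance
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    G_st = G[np.ix_(test, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G_st @ alpha


def top_markers(M, y, train, k):
    """Rank markers by |correlation| with the phenotype, using training lines only.

    This is a simplified stand-in for GWAS, not a mixed-model association test.
    """
    X = M[train] - M[train].mean(axis=0)
    yc = y[train] - y[train].mean()
    num = X.T @ yc
    den = np.sqrt((X ** 2).sum(axis=0) * (yc ** 2).sum())
    r = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return np.argsort(-np.abs(r))[:k]


def accuracy(y_true, y_pred):
    """Prediction accuracy as the Pearson correlation of observed and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 195, 2000                         # 195 lines as in the abstract; 2000 markers is arbitrary
    M = rng.binomial(2, 0.3, size=(n, m)).astype(float)
    causal = rng.choice(m, size=20, replace=False)
    y = M[:, causal] @ rng.normal(size=20) + rng.normal(scale=3.0, size=n)

    idx = rng.permutation(n)
    train, test = idx[:150], idx[150:]

    pred_all = gblup_predict(M, y, train, test)
    keep = top_markers(M, y, train, k=100)   # 100 markers, chosen from training data only
    pred_sub = gblup_predict(M[:, keep], y, train, test)

    print("all markers:", accuracy(y[test], pred_all))
    print("top 100 markers:", accuracy(y[test], pred_sub))
```

In this sketch the markers are ranked inside the training set only, so the test phenotypes never influence which markers are kept; whether the study selected markers this way is not stated in the abstract.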

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0045/11711298/7b49094fdf8f/TPG2-18-e20549-g002.jpg
