WheatGP, a genomic prediction method based on CNN and LSTM.

Authors

Wang Chunying, Zhang Di, Ma Yuexin, Zhao Yonghao, Liu Ping, Li Xiang

Affiliations

State Key Laboratory of Wheat Improvement, Shandong Agricultural University, 61 Daizong Street, Tai'an 271018, China.

Shandong Engineering Research Center of Agricultural Equipment Intelligentization, College of Mechanical and Electronic Engineering, Shandong Agricultural University, 61 Daizong Street, Tai'an 271018, China.

Publication information

Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf191.

Abstract

Wheat plays a crucial role in ensuring food security. However, its complex genetic structure and trait variation pose significant challenges for breeding superior varieties. In this study, a genomic prediction method for wheat (WheatGP) is proposed. WheatGP is designed to improve the phenotype prediction accuracy by modeling both additive genetic effects and epistatic genetic effects. It is primarily composed of a convolutional neural network (CNN) module and a long short-term memory (LSTM) module. The multilayer CNNs within the CNN module focus on capturing short-range dependencies within the genomic sequence. Meanwhile, the LSTM module, with its unique gating mechanism, is designed to retain long-distance dependency relationships between gene loci in the features. Therefore, WheatGP could comprehensively extract multilevel features from genomic inputs. Compared to ridge regression best linear unbiased prediction (rrBLUP), extreme gradient boosting (XGBoost), support vector regression (SVR), and deep neural network genomic prediction (DNNGP), WheatGP demonstrates a clear advantage in terms of prediction accuracy. The prediction accuracy for wheat yield reaches 0.73, while the prediction accuracies for various agronomic traits range between 0.62 and 0.78. It also exhibits robust performance across other crop types and multi-omics datasets. In addition, SHapley Additive exPlanations (SHAP) is employed to evaluate the contributions of inputs to the predictive model. As a high-performance tool for genomic prediction in wheat, WheatGP opens up new possibilities for achieving efficient and optimized wheat breeding.
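The CNN-then-LSTM idea described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the 0/1/2 marker encoding, the layer sizes, the random untrained weights, and all function names (`conv1d_relu`, `lstm_last_hidden`, `predict`) are assumptions made only for illustration. A convolution summarizes short windows of neighboring markers, and an LSTM then scans those local features along the genome so that information from distant loci can be carried forward in its cell state.

```python
import math
import random


def conv1d_relu(x, kernels, biases):
    """Valid 1D convolution followed by ReLU.

    x: list of floats, one value per marker.
    kernels: list of C filters, each a list of k weights.
    Returns len(x) - k + 1 positions, each holding C local features.
    """
    k = len(kernels[0])
    out = []
    for t in range(len(x) - k + 1):
        window = x[t:t + k]
        out.append([
            max(0.0, sum(w * v for w, v in zip(kern, window)) + b)
            for kern, b in zip(kernels, biases)
        ])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(gate, x, h):
    """Compute W @ x + U @ h + b for one LSTM gate, using plain lists."""
    W, U, b = gate
    return [
        sum(w * v for w, v in zip(w_row, x))
        + sum(u * v for u, v in zip(u_row, h))
        + b_j
        for w_row, u_row, b_j in zip(W, U, b)
    ]


def lstm_last_hidden(seq, gates, hidden):
    """Run a single-layer LSTM over seq and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in seq:
        i = [sigmoid(z) for z in affine(gates["input"], x, h)]
        f = [sigmoid(z) for z in affine(gates["forget"], x, h)]
        o = [sigmoid(z) for z in affine(gates["output"], x, h)]
        g = [math.tanh(z) for z in affine(gates["cell"], x, h)]
        # The forget gate decides how much earlier (long-range) signal to keep.
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def rand_matrix(rng, rows, cols, scale=0.3):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def init_model(rng, n_filters=4, kernel_size=3, hidden=5):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    names = ("input", "forget", "output", "cell")
    return {
        "kernels": rand_matrix(rng, n_filters, kernel_size),
        "conv_bias": [0.0] * n_filters,
        "lstm": {
            name: (rand_matrix(rng, hidden, n_filters),
                   rand_matrix(rng, hidden, hidden),
                   [0.0] * hidden)
            for name in names
        },
        "readout": [rng.uniform(-0.3, 0.3) for _ in range(hidden)],
        "hidden": hidden,
    }


def predict(model, genotype):
    """Markers -> local CNN features -> LSTM summary -> scalar phenotype."""
    x = [float(g) for g in genotype]
    local = conv1d_relu(x, model["kernels"], model["conv_bias"])
    h = lstm_last_hidden(local, model["lstm"], model["hidden"])
    return sum(w * v for w, v in zip(model["readout"], h))


rng = random.Random(0)
model = init_model(rng)
# Hypothetical genotype: 20 markers coded as 0, 1, or 2 allele copies.
genotype = [rng.choice((0, 1, 2)) for _ in range(20)]
prediction = predict(model, genotype)
print(prediction)
```

Because the weights are random, the printed value carries no biological meaning; a real model would be trained on genotype and phenotype pairs and would typically be written with a deep learning framework rather than plain lists.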


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1e3/12021598/c564606e99c6/bbaf191f1.jpg
