Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.
Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.
Nucleic Acids Res. 2021 Jul 2;49(W1):W228-W236. doi: 10.1093/nar/gkab407.
G2PDeep is an open-access web server that provides a deep-learning framework for quantitative phenotype prediction and the discovery of genomic markers. It takes zygosity or single nucleotide polymorphism (SNP) information from plants and animals as input and predicts a quantitative phenotype of interest together with the genomic markers associated with that phenotype. It offers researchers a one-stop platform for creating deep-learning models through an interactive web interface and training them on uploaded data, using high-performance computing resources at the backend. G2PDeep also provides a series of interfaces for monitoring the training process and comparing performance among trained models. Trained models can then be deployed automatically. Quantitative phenotypes and genomic markers are predicted with a user-selected trained model, and the results are visualized. Our state-of-the-art model has been benchmarked by other researchers and has demonstrated competitive performance in quantitative phenotype prediction. In addition, the server integrates the soybean nested association mapping (SoyNAM) dataset with five phenotypes: grain yield, height, moisture, oil, and protein. A publicly available dataset for seed protein and oil content has also been integrated into the server. The G2PDeep server is publicly available at http://g2pdeep.org. The Python-based deep-learning model is available at https://github.com/shuaizengMU/G2PDeep_model.
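To make the input described above more concrete, the following is a minimal sketch of how zygosity calls might be one-hot encoded before being passed to a deep-learning model. The abstract does not specify G2PDeep's actual encoding, so the numeric coding (0, 1, 2, and -1 for a missing call), the constant `ZYGOSITY_CODES`, and the function `one_hot_zygosity` are all hypothetical and chosen only for illustration.

```python
# Hypothetical coding, assumed for illustration only (not taken from G2PDeep):
#   0 = homozygous reference, 1 = heterozygous,
#   2 = homozygous alternate, -1 = missing call.
ZYGOSITY_CODES = (0, 1, 2, -1)


def one_hot_zygosity(genotypes):
    """Encode a samples x markers matrix of zygosity codes as one-hot vectors.

    Returns a nested list of shape
    (n_samples, n_markers, len(ZYGOSITY_CODES)).
    """
    encoded = []
    for sample in genotypes:
        row = []
        for call in sample:
            if call not in ZYGOSITY_CODES:
                raise ValueError(f"unexpected genotype code: {call!r}")
            # One channel per zygosity state; the last channel marks missing calls.
            vector = [0.0] * len(ZYGOSITY_CODES)
            vector[ZYGOSITY_CODES.index(call)] = 1.0
            row.append(vector)
        encoded.append(row)
    return encoded


# Two samples genotyped at three markers.
genotypes = [
    [0, 1, 2],
    [2, -1, 0],
]
encoded = one_hot_zygosity(genotypes)
print(len(encoded), len(encoded[0]), len(encoded[0][0]))  # 2 3 4
```

Giving missing calls their own channel, rather than imputing them, is one possible design choice; a real pipeline may handle missing data differently.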