El Hanafi Samira, Jiang Yong, Kehel Zakaria, Schulthess Albert W, Zhao Yusheng, Mascher Martin, Haupt Max, Himmelbach Axel, Stein Nils, Amri Ahmed, Reif Jochen C
Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany.
International Center for Agricultural Research in Dry Areas (ICARDA), Rabat, Morocco.
Front Plant Sci. 2023 Aug 28;14:1227656. doi: 10.3389/fpls.2023.1227656. eCollection 2023.
Genome-wide prediction is a powerful tool in breeding. Initial results suggest that genome-wide approaches are also promising for enhancing the use of genebank material: predicting the performance of plant genetic resources can unlock their hidden potential, fill the information gap in genebanks across the world, and hence underpin prebreeding programs. As a proof of concept, we evaluated the power of across-genebank prediction for extensive germplasm collections, relying on historical data on flowering/heading date, plant height, and thousand kernel weight of 9,344 barley (Hordeum vulgare L.) plant genetic resources from the German Federal Ex situ Genebank for Agricultural and Horticultural Crops (IPK) and of 1,089 accessions from the International Center for Agricultural Research in the Dry Areas (ICARDA) genebank. Based on prediction abilities for each trait, three scenarios for predictive characterization were compared: 1) a benchmark scenario, where test and training sets contain only ICARDA accessions; 2) across-genebank predictions using IPK as the training set and ICARDA as the test set; and 3) integrated genebank predictions, in which IPK plus 30% of the ICARDA accessions form the training set used to predict the remaining ICARDA accessions. Within the population of ICARDA accessions, prediction abilities were low to moderate, presumably because of the limited number of accessions available to train the model. Interestingly, ICARDA prediction abilities were boosted up to ninefold by using training sets composed of IPK plus 30% of ICARDA accessions. Pervasive genotype × environment interactions (GEIs) can be an obstacle to training robust genome-wide prediction models across genebanks. The gain observed here suggests that the potential adverse effect of GEI on prediction ability was counterbalanced by the augmented training set, which retained some connectivity to the test set. Therefore, across-genebank predictions hold the promise to improve the curation of the world's genebank collections and contribute significantly to the long-term development of traditional genebanks toward biodigital resource centers.
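The three training/test scenarios can be illustrated with a minimal sketch on purely synthetic data. Everything below is an assumption made for illustration: the function names, the ridge-regression model (a simple stand-in, not necessarily the model used in the study), the simulated marker matrix, and the 30/70 split used for the benchmark scenario. The simulation contains no population structure or genotype × environment interaction, so the numbers it produces say nothing about the reported results; prediction ability is taken here as the Pearson correlation between observed and predicted values.

```python
import numpy as np


def simulate(n_large=300, n_small=60, n_markers=50, noise_sd=0.5, seed=1):
    """Synthetic marker matrix and phenotype (hypothetical data, not the study's).

    Rows 0..n_large-1 stand in for the large collection, the rest for the small one.
    """
    rng = np.random.default_rng(seed)
    X = rng.binomial(2, 0.5, size=(n_large + n_small, n_markers)).astype(float)
    beta = rng.normal(size=n_markers)
    y = X @ beta + rng.normal(scale=noise_sd, size=X.shape[0])
    return X, y


def build_scenarios(n_large, n_small, frac=0.3, seed=0):
    """Return {scenario: (train_idx, test_idx)} for the three scenarios."""
    rng = np.random.default_rng(seed)
    large = np.arange(n_large)
    small = n_large + rng.permutation(n_small)
    k = int(round(frac * n_small))
    small_train, small_test = small[:k], small[k:]
    return {
        # 1) small collection only (the 30/70 split is an assumption)
        "benchmark": (small_train, small_test),
        # 2) train on the large collection, test on the whole small one
        "across": (large, np.sort(small)),
        # 3) large collection plus 30% of the small one predicts the rest
        "integrated": (np.concatenate([large, small_train]), small_test),
    }


def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Ridge regression on markers, solved in its dual (n x n) form."""
    mu = y_train.mean()
    x_mean = X_train.mean(axis=0)
    Xc = X_train - x_mean
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - mu)
    marker_effects = Xc.T @ alpha
    return mu + (X_test - x_mean) @ marker_effects


def evaluate(X, y, scenarios, lam=1.0):
    """Prediction ability (Pearson r of observed vs. predicted) per scenario."""
    abilities = {}
    for name, (train, test) in scenarios.items():
        y_pred = ridge_fit_predict(X[train], y[train], X[test], lam=lam)
        abilities[name] = float(np.corrcoef(y[test], y_pred)[0, 1])
    return abilities


if __name__ == "__main__":
    X, y = simulate()
    for name, r in evaluate(X, y, build_scenarios(300, 60)).items():
        print(f"{name}: prediction ability = {r:.2f}")
```

In this sketch the benchmark and integrated scenarios share the same held-out accessions, which keeps their prediction abilities directly comparable; the across scenario is tested on the entire small collection, following the description above.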