Department of Plant Sciences, North Dakota State University, Fargo, ND, USA.
Department of Genetics and Plant Breeding, Bangladesh Agricultural University, Mymensingh, 2202, Bangladesh.
Sci Rep. 2024 Feb 8;14(1):3196. doi: 10.1038/s41598-024-53462-w.
Breeding programs require exhaustive phenotyping of germplasm, which is time-consuming and expensive. Genomic prediction helps breeders harness the diversity of any collection to bypass phenotyping. Here, we examined the potential of genomic prediction for seed yield and nine agronomic traits using 26,171 single nucleotide polymorphism (SNP) markers in a set of 337 flax (Linum usitatissimum L.) germplasm accessions phenotyped in five environments. We evaluated 14 prediction models and several factors affecting predictive ability based on cross-validation schemes. With the whole marker set, predictive ability varied significantly among models and across traits. The ridge regression (RR) model, which covers additive gene action, yielded better predictive ability for most of the traits, whereas models capturing epistatic gene action gave higher predictive ability for low-heritability traits. Marker subsets based on linkage disequilibrium decay distance gave significantly higher predictive abilities than the whole marker set, whereas for randomly selected markers, predictive ability reached a plateau above 3000 markers. Markers significantly associated with traits improved predictive abilities compared to the whole marker set when marker selection was made on the whole population instead of on the training set, indicating clear overfitting. Correcting for population structure did not increase predictive abilities compared to using the whole collection. However, stratified sampling, in which representative genotypes were picked from each cluster, improved predictive abilities. The indirect predictive ability for a trait was proportional to its correlation with other traits. These results will help breeders select the best models, the optimum marker set, and a suitable genotype set to perform indirect selection for quantitative traits in this diverse flax germplasm collection.
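To make "predictive ability based on cross-validation" concrete, the following is a minimal sketch of ridge regression genomic prediction evaluated by k-fold cross-validation, with predictive ability taken as the Pearson correlation between predicted and observed phenotypes in the held-out fold. It is not the authors' pipeline: the abstract does not name the software or the exact cross-validation settings, so the function names (`ridge_fit`, `cv_predictive_ability`, `simulate_traits`), the penalty `lam`, the fold count, and the simulated 0/1/2 genotypes are all assumptions made for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Fit ridge regression on (already centered) markers X.

    Uses the dual form beta = X' (XX' + lam*I)^-1 (y - mean(y)), which is
    convenient when markers outnumber genotypes. Returns (intercept, beta).
    """
    mu = y.mean()
    K = X @ X.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y - mu)
    return mu, X.T @ alpha


def cv_predictive_ability(X, y, lam=1.0, k=5, seed=0):
    """Mean Pearson correlation between predicted and observed phenotypes
    over k held-out folds (an assumed, simple cross-validation scheme)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    scores = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        # Center markers with training-set means only, so that no
        # information from the held-out genotypes leaks into the model.
        col_means = X[train].mean(axis=0)
        mu, beta = ridge_fit(X[train] - col_means, y[train], lam)
        pred = mu + (X[test] - col_means) @ beta
        scores.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(scores))


def simulate_traits(n=300, m=300, n_qtl=30, h2=0.8, seed=1):
    """Simulated data only (hypothetical): n genotypes, m SNPs coded 0/1/2,
    additive effects at n_qtl markers, noise scaled to heritability h2."""
    rng = np.random.default_rng(seed)
    X = rng.integers(0, 3, size=(n, m)).astype(float)
    effects = np.zeros(m)
    qtl = rng.choice(m, n_qtl, replace=False)
    effects[qtl] = rng.normal(size=n_qtl)
    g = X @ effects
    noise_sd = np.sqrt(g.var() * (1 - h2) / h2)
    y = g + rng.normal(scale=noise_sd, size=n)
    return X, y


if __name__ == "__main__":
    X, y = simulate_traits()
    print("predictive ability:", cv_predictive_ability(X, y, lam=50.0))
```

Every data-dependent step here, including marker centering, happens inside the fold loop on training genotypes only. The same discipline applies to marker selection: choosing trait-associated markers on the whole population before splitting lets the held-out phenotypes influence the model, which is the overfitting the abstract reports.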