Ballén-Taborda Carolina, Lyerly Jeanette, Smith Jared, Howell Kimberly, Brown-Guedira Gina, Babar Md Ali, Harrison Stephen A, Mason Richard E, Mergoum Mohamed, Murphy J Paul, Sutton Russell, Griffey Carl A, Boyles Richard E
Department of Plant and Environmental Sciences, Clemson University, Clemson, SC, United States.
Pee Dee Research and Education Center, Clemson University, Florence, SC, United States.
Front Genet. 2022 Oct 7;13:964684. doi: 10.3389/fgene.2022.964684. eCollection 2022.
With the rapid generation and preservation of both genomic and phenotypic information for many genotypes within crops and across locations, emerging breeding programs have a valuable opportunity to leverage these resources to 1) establish the most appropriate genetic foundation at program inception and 2) implement robust genomic prediction platforms that can effectively select future breeding lines. Integrating genomics-enabled breeding into cultivar development can save costs and allow resources to be reallocated towards advanced (i.e., later) stages of field evaluation, which can facilitate an increased number of testing locations and replicates within locations. In this context, a reestablished winter wheat breeding program was used as a case study to understand best practices for leveraging and tailoring existing genomic and phenotypic resources to determine optimal genetics for a specific target population of environments. First, historical multi-environment phenotype data, representing 1,285 advanced breeding lines, were compiled from multi-institutional testing as part of the SunGrains cooperative and used to produce genotype plus genotype-by-environment (GGE) biplots and principal component analysis (PCA) for yield. Locations were clustered into 22 subsets based on highly correlated line performance among the target population of environments. For each subset, estimated marginal means (EMMs) and best linear unbiased predictions (BLUPs) were calculated using linear models with an R package. Second, for each subset, training populations (TPs) representative of the new South Carolina (SC) breeding lines were determined based on genetic relatedness using an R package. Third, for each TP, phenotypic values and single nucleotide polymorphism (SNP) data were incorporated into mixed models to generate genomic estimated breeding values (GEBVs) for grain yield (YLD), test weight (TW), heading date (HD) and plant height (PH). Using a five-fold cross-validation strategy, an average prediction accuracy of 0.42 was obtained for yield across all TPs. Validation with 58 SC elite breeding lines resulted in an accuracy of 0.62 when the TP included complete historical data.
Lastly, QTL-by-environment interaction for 18 major effect genes across three geographic regions was examined. Lines harboring major QTL in the absence of disease could potentially underperform (e.g., Fhb1 R-gene), whereas it is advantageous to express a major QTL under biotic pressure (e.g., stripe rust R-gene). This study highlights the importance of genomics-enabled breeding and multi-institutional partnerships to accelerate cultivar development.
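The five-fold cross-validation used to estimate prediction accuracy can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the model or software used here, so ridge regression on marker codes stands in for a genomic mixed model, accuracy is assumed to be the Pearson correlation between predicted and observed values, and all data are simulated. The function names (`ridge_fit_predict`, `kfold_accuracy`), the penalty `lam`, and the simulation settings are hypothetical.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, lam):
    """Fit ridge regression on marker codes and predict the test lines.

    Ridge regression is used here as an assumed stand-in for a genomic
    mixed model; it is not taken from the abstract.
    """
    mu = y_train.mean()
    col_means = X_train.mean(axis=0)
    Xc = X_train - col_means
    # Dual form: solve an n x n system, which is cheaper when the number
    # of markers exceeds the number of lines.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - mu)
    beta = Xc.T @ alpha  # per-marker effect estimates
    return mu + (X_test - col_means) @ beta


def kfold_accuracy(X, y, k=5, lam=1.0, seed=0):
    """Return the mean and per-fold Pearson correlation over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        pred = ridge_fit_predict(X[train], y[train], X[test], lam)
        accs.append(float(np.corrcoef(pred, y[test])[0, 1]))
    return float(np.mean(accs)), accs


# Simulated data (hypothetical): 300 lines, 200 SNPs coded 0/1/2.
rng = np.random.default_rng(42)
n_lines, n_snps = 300, 200
X = rng.integers(0, 3, size=(n_lines, n_snps)).astype(float)
effects = rng.normal(0.0, 0.1, n_snps)
g = X @ effects                              # simulated genetic values
y = g + rng.normal(0.0, g.std(), n_lines)    # roughly 0.5 heritability

mean_acc, fold_accs = kfold_accuracy(X, y, k=5, lam=100.0)
print(f"mean accuracy over {len(fold_accs)} folds: {mean_acc:.2f}")
```

In practice the penalty would be tuned or derived from estimated variance components, and each fold would be evaluated within a training population rather than on simulated lines.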