Aalborg Trine, Sverrisdóttir Elsa, Kristensen Heidi Thorgaard, Nielsen Kåre Lehmann
Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark.
Front Plant Sci. 2024 Mar 8;15:1340189. doi: 10.3389/fpls.2024.1340189. eCollection 2024.
Genomic prediction and genome-wide association studies (GWAS) are increasingly used in potato to identify QTLs for key performance traits and to support breeding through genomic selection. Elite cultivars are tetraploid and highly heterozygous but also share many common ancestors and generation-spanning inbreeding events, resulting from the clonal propagation of potatoes through seed potatoes. Consequently, many SNP markers are not in a 1:1 relationship with a single allele variant but are shared across several alleles that might exert varying effects on a given trait. The impact of such redundant "diluted" predictors on the statistical models underpinning GWAS and genomic prediction has scarcely been evaluated, despite the potential impact on model accuracy and performance. We evaluated the impact of marker location, marker type, and marker density on the genomic prediction and GWAS of five key performance traits in tetraploid potato (chipping quality, dry matter content, length/width ratio, senescence, and yield). A 762-offspring panel of a diallel cross of 18 elite cultivars was genotyped by sequencing, and markers were annotated according to a reference genome. Genomic prediction models (GBLUP) were trained on four marker subsets [non-synonymous (29,553 SNPs), synonymous (31,229), non-coding (32,388), and a combination], and robustness to marker reduction was investigated. Single-marker regression GWAS was performed for each trait and marker subset. The best cross-validated prediction correlation coefficients of 0.54, 0.75, 0.49, 0.35, and 0.28 were obtained for chipping quality, dry matter content, length/width ratio, senescence, and yield, respectively. The trait prediction abilities were similar across all marker types; the only exception was yield, for which non-synonymous variants improved predictive ability by 16%. The response to marker reduction did not depend on marker type but rather on trait. Traits with high predictive abilities, e.g., dry matter content, reached a plateau using fewer markers than traits with intermediate to low correlations, such as yield. The predictions were unbiased across all traits, all marker types, and all marker densities above 100 SNPs. Our results suggest that using non-synonymous variants does not enhance the performance of genomic prediction for most traits. The major known QTLs were identified by GWAS and were reproducible across exonic and whole-genome variant sets for dry matter content, length/width ratio, and senescence. In contrast, minor QTL detection was marker type dependent.
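The GBLUP and marker-reduction workflow summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are simulated, the genomic relationship matrix uses a VanRaden-style formula adapted to tetraploid dosages (0 to 4), the heritability `h2` is fixed rather than estimated by REML, and all function names (`vanraden_grm`, `gblup_predict`, `cv_predictive_ability`) are hypothetical.

```python
import numpy as np

def vanraden_grm(dosage, ploidy=4):
    """Additive genomic relationship matrix from allele dosages (0..ploidy)."""
    p = dosage.mean(axis=0) / ploidy          # alternative-allele frequency per marker
    centered = dosage - ploidy * p            # centre each marker column
    scale = ploidy * np.sum(p * (1.0 - p))    # sum of expected marker variances
    return centered @ centered.T / scale

def gblup_predict(G, y, train, test, h2=0.5):
    """Predict test individuals from training phenotypes (h2 is assumed, not estimated)."""
    lam = (1.0 - h2) / h2                     # residual-to-genetic variance ratio
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def cv_predictive_ability(dosage, y, k=5, h2=0.5, seed=0):
    """k-fold cross-validated correlation between predicted and observed phenotypes."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    G = vanraden_grm(dosage)
    pred = np.empty(len(y))
    for fold in folds:
        train = np.setdiff1d(np.arange(len(y)), fold)
        pred[fold] = gblup_predict(G, y, train, fold, h2)
    return np.corrcoef(pred, y)[0, 1]

# Simulated tetraploid panel (an assumption for illustration only).
rng = np.random.default_rng(1)
n, m = 400, 200
freq = rng.uniform(0.1, 0.9, m)
dosage = rng.binomial(4, freq, size=(n, m)).astype(float)
g = dosage @ rng.normal(0.0, 1.0, m)          # every marker has a small additive effect
g = (g - g.mean()) / g.std()
y = g + rng.normal(0.0, 1.0, n)               # simulated heritability of about 0.5

# Marker reduction: refit on random subsets of decreasing size.
results = {}
for n_markers in (200, 50, 10):
    cols = rng.choice(m, n_markers, replace=False)
    results[n_markers] = cv_predictive_ability(dosage[:, cols], y)
    print(n_markers, round(results[n_markers], 2))
```

Random subsampling is only one possible reduction scheme; the abstract does not state how markers were thinned, so the loop above should be read as a placeholder for whichever scheme is actually used.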