Umeå Plant Science Centre, Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, SE-90183, Umeå, Sweden.
Program in Plant Ecology and Evolution, Department of Ecology and Genetics, Evolutionary Biology Centre and SciLifeLab, Uppsala University, Uppsala, Sweden.
Genome Biol. 2021 Jun 13;22(1):179. doi: 10.1186/s13059-021-02392-1.
Genome-wide association studies (GWAS) identify loci underlying variation in complex traits. One of the main limitations of GWAS is the availability of reliable phenotypic data, particularly for long-lived tree species. Although breeding programs already hold an extensive amount of phenotypic data, accounting for its high heterogeneity is a great challenge. We combine spatial and factor-analytic analyses to standardize the heterogeneous data from 120 field experiments comprising 483,424 progenies of Norway spruce, and we implement the largest reported GWAS for trees using 134,605 SNPs from exome sequencing of 5056 parental trees.
We identify 55 novel quantitative trait loci (QTLs) that are associated with phenotypic variation. The largest number of QTLs is associated with the budburst stage, followed by diameter at breast height, wood quality, and frost damage. The two QTLs with the largest effects are pleiotropic for budburst stage, frost damage, and diameter, and are associated with MAP3K genes. Genotype data called from exome capture, a recently developed SNP array, and gene expression data indirectly support this discovery.
Several important QTLs associated with growth and frost damage have been verified in several southern and northern progeny plantations, indicating that these loci can be used in QTL-assisted genomic selection. Our study also demonstrates that existing heterogeneous phenotypic data from breeding programs, collected over several decades, are an important data source for GWAS and that integrating such data into GWAS should be a major area of future inquiry.
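As a rough illustration of what a single-marker association scan involves at this scale, the sketch below computes a least-squares effect estimate for one SNP and a genome-wide significance threshold for 134,605 tests. It is not the authors' pipeline: the abstract does not state the association model or the multiple-testing correction, so the Bonferroni correction, the 0.05 error rate, the toy data, and the function names `bonferroni_threshold` and `marker_slope` are all assumptions made for illustration.

```python
from statistics import mean


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction.

    Assumption: the abstract does not say which correction was used.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def marker_slope(dosages, phenotypes):
    """Least-squares slope of a standardized phenotype on allele dosage (0, 1, 2).

    This is the additive effect estimate from a simple single-marker
    regression with no covariates (hypothetical simplification).
    """
    if len(dosages) != len(phenotypes):
        raise ValueError("dosages and phenotypes must have the same length")
    g_bar = mean(dosages)
    y_bar = mean(phenotypes)
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("marker is monomorphic in this sample")
    sxy = sum((g - g_bar) * (y - y_bar) for g, y in zip(dosages, phenotypes))
    return sxy / sxx


# Threshold for the 134,605 SNPs mentioned in the abstract (about 3.7e-7).
threshold = bonferroni_threshold(0.05, 134605)

# Toy data: six parental trees, dosage of one SNP and an adjusted trait value.
toy_dosages = [0, 1, 2, 0, 1, 2]
toy_phenotypes = [1.0, 1.5, 2.0, 1.0, 1.5, 2.0]
effect = marker_slope(toy_dosages, toy_phenotypes)  # 0.5 per allele copy
```

In a real analysis, the phenotype passed to such a regression would be the value standardized across field experiments, which is the role the spatial and factor-analytic step plays in the study.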