Agriculture Victoria Research, Agribio, Bundoora, VIC, 3083, Australia.
Agriculture Victoria Research, Ellinbank Centre, Ellinbank, Gippsland, VIC, 3821, Australia.
Genet Sel Evol. 2022 Sep 6;54(1):60. doi: 10.1186/s12711-022-00749-z.
Sharing individual phenotype and genotype data between countries is complex and fraught with potential errors, whereas sharing summary statistics from genome-wide association studies (GWAS) is relatively straightforward and would therefore be especially useful for traits that are expensive or difficult to measure, such as feed efficiency. Here we examined: (1) the sharing of individual cow data from international partners; and (2) the use of sequence variants selected from GWAS of international cow data to evaluate the accuracy of genomic estimated breeding values (GEBV) for residual feed intake (RFI) in Australian cows.
GEBV for RFI were estimated using genomic best linear unbiased prediction (GBLUP) with 50k or high-density single nucleotide polymorphisms (SNPs), from a training population of 3797 individuals, in univariate to trivariate analyses in which the three traits were RFI phenotypes calculated for 584 Australian lactating cows (AUSc), 824 growing heifers (AUSh), and 2526 international lactating cows (OVE). Accuracies of GEBV in AUSc were evaluated by either cohort-by-birth-year or fourfold random cross-validation. GEBV of AUSc were also predicted using only the AUS training population, with a weighted genomic relationship matrix constructed from SNPs on the 50k array and sequence variants selected from a meta-GWAS that included only international datasets. The genomic heritabilities estimated using the AUSc, OVE and AUSh datasets were moderate, ranging from 0.20 to 0.36. The genetic correlations (r) between the heifer and cow traits ranged from 0.30 to 0.95 but had large standard errors. The mean accuracies of GEBV in Australian cows were up to 0.32 and almost doubled when either overseas cows, or both overseas cows and AUS heifers, were included in the training population. Accuracies also increased when selected sequence variants were combined with 50k SNPs, but with a smaller relative increase.
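The prediction and validation steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes a VanRaden-style genomic relationship matrix with optional per-SNP weights (one common way to build a weighted matrix), a heritability that is supplied rather than estimated, and accuracy taken as the correlation between GEBV and phenotype in the validation fold. The function names `vanraden_grm`, `gblup_predict` and `fourfold_cv_accuracy` are hypothetical.

```python
import numpy as np

def vanraden_grm(M, weights=None):
    """Genomic relationship matrix from an (animals x SNPs) matrix coded 0/1/2.

    With weights=None every SNP counts equally; passing per-SNP weights
    gives a weighted matrix (an assumed form, not taken from the paper).
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0                  # observed allele frequencies
    Z = M - 2.0 * p                           # centre each SNP column
    d = np.ones(M.shape[1]) if weights is None else np.asarray(weights, dtype=float)
    denom = np.sum(d * 2.0 * p * (1.0 - p))   # scales the average diagonal to about 1
    return (Z * d) @ Z.T / denom

def gblup_predict(G, y, train, test, h2):
    """GEBV of test animals from training phenotypes, for an assumed h2."""
    lam = (1.0 - h2) / h2                     # ratio of residual to genetic variance
    y_centred = y[train] - y[train].mean()    # the only fixed effect here is the mean
    G_tt = G[np.ix_(train, train)]
    G_vt = G[np.ix_(test, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y_centred)
    return G_vt @ alpha

def fourfold_cv_accuracy(G, y, h2, seed=0):
    """Mean correlation between GEBV and phenotype over four random folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), 4)
    accuracies = []
    for k in range(4):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(4) if j != k])
        gebv = gblup_predict(G, y, train, test, h2)
        accuracies.append(np.corrcoef(gebv, y[test])[0, 1])
    return float(np.mean(accuracies))
```

A cohort-by-birth-year validation would follow the same pattern, with the folds defined by birth year instead of a random permutation.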
The accuracy of RFI GEBV increased when international data were used or when selected sequence variants were combined with 50k SNP array data. This suggests that if direct sharing of data is not feasible, a meta-analysis of summary GWAS statistics could provide selected SNPs for custom panels to use in genomic selection programs. However, since this finding is based on a small cross-validation study, confirmation through a larger study is recommended.
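As an illustration of how summary GWAS statistics might be combined without sharing individual records, the sketch below applies a fixed-effect, inverse-variance weighted meta-analysis per SNP and then selects SNPs below a p-value threshold. The abstract does not state which meta-analysis method or selection rule was used, so both are assumptions made for illustration, and the names `ivw_meta` and `select_snps` are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def ivw_meta(betas, ses):
    """Inverse-variance weighted meta-analysis.

    betas, ses: arrays of shape (studies, SNPs) holding per-study effect
    estimates and their standard errors for the same SNPs and alleles.
    Returns the combined effect, standard error, z-score and two-sided p-value.
    """
    betas = np.asarray(betas, dtype=float)
    ses = np.asarray(ses, dtype=float)
    w = 1.0 / ses**2                               # weight = 1 / sampling variance
    beta = (w * betas).sum(axis=0) / w.sum(axis=0)
    se = np.sqrt(1.0 / w.sum(axis=0))
    z = beta / se
    # Two-sided normal p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p = np.array([erfc(abs(v) / sqrt(2.0)) for v in z])
    return beta, se, z, p

def select_snps(p_values, threshold):
    """Indices of SNPs whose meta-analysis p-value is below the threshold."""
    return np.flatnonzero(np.asarray(p_values) < threshold)
```

The selected indices could then define the extra variants added to a 50k panel; the threshold is left as a parameter because the abstract does not give one.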