Dagnachew Binyam, Aslam Muhammad Luqman, Hillestad Borghild, Meuwissen Theo, Sonesson Anna
Fisheries and Aquaculture Research, Nofima AS-Norwegian Institute of Food, Tromsø, Norway.
Benchmark Genetics, Bergen, Norway.
Front Genet. 2022 Aug 26;13:896774. doi: 10.3389/fgene.2022.896774. eCollection 2022.
Genomic selection has great potential in aquaculture breeding because many important traits cannot be measured directly on the selection candidates themselves. However, its implementation has been hindered by high genotyping costs, since many individuals must be genotyped. In this study, we explored the potential of DNA pooling for creating a reference population as a tool for genomic selection of a binary trait. Two datasets from the SalmoBreed population challenged with salmonid alphavirus, which causes pancreas disease, were used. Dataset-1, which includes 855 individuals (478 survivors and 377 dead), was used to develop four DNA pool samples (i.e., two pools each for dead fish and survivors). Dataset-2 includes 914 individuals (435 survivors and 479 dead) belonging to 65 full-sibling families and was used to develop in-silico DNA pools. SNP effects were estimated from the allele frequencies of the pools and used to calculate genomic estimated breeding values (GEBVs). The correlation between SNP effects estimated from individual genotypes and from pooled data increased from 0.3 to 0.912 when the number of pools increased from 1 to 200. A similar trend was observed for the correlation between GEBVs, which increased from 0.84 to 0.976 as the number of pools per phenotype increased from 1 to 200. For dataset-1, the accuracy of prediction was 0.71 and 0.70 when the DNA pools were sequenced at 40× and 20× depth, respectively, compared with 0.73 for the SNP chip genotypes. For dataset-2, the accuracy of prediction increased from 0.574 to 0.691 when the number of in-silico DNA pools increased from 1 to 200, compared with 0.712 when individual genotypes were used. Sequencing depth had only a limited effect on the correlation of GEBVs and on prediction accuracy. The results showed that a large number of pools is required to predict as well as individual genotypes; alternative, more effective pooling strategies should therefore be studied to reduce the number of pools without reducing prediction power. Nevertheless, the study demonstrates that pooling of a reference population can be used as a tool for balancing cost against accuracy of selection.
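The in-silico pooling step and the GEBV calculation described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `make_pools` and `gebv`, the 0/1/2 genotype coding, and the random assignment of individuals to pools within each phenotype class are all assumptions made for illustration. The abstract does not specify how SNP effects are estimated from the pool frequencies, so the sketch takes the effects as given.

```python
import random


def make_pools(genotypes, phenotypes, n_pools, seed=0):
    """Build in-silico DNA pools (hypothetical helper, not from the paper).

    genotypes:  list of per-individual lists, each SNP coded 0/1/2
                (assumed coding: count of one allele).
    phenotypes: list of class labels, e.g. 1 = survivor, 0 = dead.
    n_pools:    number of pools to form within each phenotype class.

    Returns a list of (phenotype, allele_frequencies) tuples, one per pool.
    """
    rng = random.Random(seed)
    n_snps = len(genotypes[0])
    pools = []
    for cls in sorted(set(phenotypes)):
        members_all = [i for i, p in enumerate(phenotypes) if p == cls]
        rng.shuffle(members_all)  # random assignment is an assumption
        for k in range(n_pools):
            members = members_all[k::n_pools]
            if not members:
                continue  # fewer individuals than pools in this class
            # Pool allele frequency = allele count / (2 * number of diploid fish)
            freqs = [
                sum(genotypes[i][j] for i in members) / (2 * len(members))
                for j in range(n_snps)
            ]
            pools.append((cls, freqs))
    return pools


def gebv(genotype, snp_effects):
    """GEBV of one candidate as the sum of genotype code times SNP effect."""
    return sum(g * b for g, b in zip(genotype, snp_effects))


if __name__ == "__main__":
    # Toy data: 4 fish, 2 SNPs; values are made up for illustration only.
    toy_genotypes = [[0, 2], [2, 2], [1, 0], [1, 2]]
    toy_phenotypes = [1, 1, 0, 0]
    for cls, freqs in make_pools(toy_genotypes, toy_phenotypes, n_pools=1):
        print(cls, freqs)
```

With one pool per phenotype, each class collapses to a single allele-frequency vector. Raising `n_pools` keeps more of the between-individual variation, which is consistent with the reported rise in correlation and accuracy as the number of pools increases.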