Animal Breeding and Genomics, Wageningen University & Research, Wageningen, The Netherlands.
J Anim Breed Genet. 2021 Mar;138(2):151-160. doi: 10.1111/jbg.12512. Epub 2020 Oct 10.
For numerically small breeds, obtaining a sufficiently large breed-specific reference population for genomic prediction is challenging or simply not possible, but this may be overcome by adding individuals from another breed. To prioritize among available breeds, the effective number of chromosome segments (M) can be used as an indicator of relatedness between individuals from different breeds. M is also an important parameter in determining the accuracy of genomic prediction. It can be estimated both within a population and between two populations or breeds, as the reciprocal of the variance of genomic relationships. However, the number of individuals needed to accurately estimate within- or between-population M is currently unknown. It is also unknown whether a discrepancy in the number of genotyped individuals in two breeds affects estimates of between-population M. In this study, we conducted a simulation that mimics current domestic cattle populations to investigate how estimated M is affected by the number of genotyped individuals, single-nucleotide polymorphism (SNP) density and pedigree availability. Our results show that a small sample of 10 genotyped individuals may result in substantial over- or underestimation of M. While estimates of within-population M were hardly affected by SNP density, between-population M values were highly dependent on the number of available SNPs, with higher SNP densities being able to detect more independent chromosome segments. When pedigree relationships were subtracted from genomic relationships before computing M, estimates of within-population M were three to four times higher than estimates based on genotypes only; however, between-population M estimates remained the same. For accurate estimation of within- and between-population M, at least 50 individuals should be genotyped per population. Estimates of within-population M were highly affected by whether pedigree was used or not.
For within-population M, even the lowest SNP density (~11k) resulted in an accurate representation of family relationships in the population; however, for between-population M, many more markers are needed to capture all independent segments.
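The estimator described above, M as the reciprocal of the variance of genomic relationships, can be illustrated with a minimal sketch. The abstract does not specify how the relationships were built, so the following are assumptions made for illustration only: a VanRaden-style relationship matrix with breed-specific allele-frequency centering, the variance taken over off-diagonal elements within a population and over all cross-population elements between populations, and hypothetical function names (`centered`, `effective_segments_within`, `effective_segments_between`). The toy genotypes are random and unrelated to the simulated cattle populations of the study.

```python
import numpy as np

def centered(geno):
    """Center 0/1/2 genotypes by twice the allele frequency.

    Returns the centered matrix and the scaling term 2 * sum(p * (1 - p)).
    Assumption: allele frequencies are estimated from the sample itself.
    """
    p = geno.mean(axis=0) / 2.0
    return geno - 2.0 * p, 2.0 * np.sum(p * (1.0 - p))

def effective_segments_within(geno, pedigree_rel=None):
    """Within-population M = 1 / var(off-diagonal genomic relationships).

    If a pedigree relationship matrix is supplied, it is subtracted from
    the genomic relationships before the variance is computed.
    """
    z, scale = centered(geno)
    g = z @ z.T / scale
    if pedigree_rel is not None:
        g = g - pedigree_rel
    off_diag = g[np.triu_indices_from(g, k=1)]  # each pair counted once
    return 1.0 / np.var(off_diag)

def effective_segments_between(geno_a, geno_b):
    """Between-population M = 1 / var(relationships across populations)."""
    za, scale_a = centered(geno_a)
    zb, scale_b = centered(geno_b)
    g_ab = za @ zb.T / np.sqrt(scale_a * scale_b)
    return 1.0 / np.var(g_ab)

# Toy data: 50 individuals per population, 2000 independent SNPs.
rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 0.9, size=2000)
geno_a = rng.binomial(2, freqs, size=(50, 2000)).astype(float)
geno_b = rng.binomial(2, freqs, size=(50, 2000)).astype(float)

m_within = effective_segments_within(geno_a)
m_between = effective_segments_between(geno_a, geno_b)
print(f"within M: {m_within:.0f}, between M: {m_between:.0f}")
```

Because the toy SNPs are drawn independently, both estimates are expected to be on the order of the number of markers; in real populations, linkage disequilibrium and family structure make M far smaller than the marker count.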