Department of Biostatistics, University of Washington, Seattle, WA, 98195-1617, USA.
Department of Ecology and Evolution, University of Lausanne, CH-1015, Lausanne, Switzerland.
Heredity (Edinb). 2022 Jan;128(1):1-10. doi: 10.1038/s41437-021-00471-4. Epub 2021 Nov 25.
The two alleles an individual carries at a locus are identical by descent (ibd) if they have descended from a single ancestral allele in a reference population, and the probability of such identity is the inbreeding coefficient of the individual. Inbreeding coefficients can be predicted from pedigrees whose founders constitute the reference population, but estimation from genetic data is not possible without data from the reference population. Most inbreeding estimators that make explicit use of sample allele frequencies as estimates of allele probabilities in the reference population are confounded by average kinships with other individuals. This means that the ranking of those estimates depends on the scope of the study sample, and we show the variation in rankings for common estimators applied to different subdivisions of 1000 Genomes data. Allele-sharing estimators of within-population inbreeding relative to average kinship in a study sample, however, do have invariant rankings across all studies that include those individuals. They are unbiased when a large number of SNPs is used. We discuss how allele-sharing estimates are the relevant quantities for a range of empirical applications.
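The invariance of rankings can be illustrated with a minimal sketch. The abstract gives no formulas, so the code below assumes the commonly used allele-sharing form: self-sharing is 1 for a homozygote and 0 for a heterozygote, sharing between two individuals with allele dosages X and X' is 0.5[1 + (X - 1)(X' - 1)], and inbreeding is estimated as (A_j - A_S)/(1 - A_S), where A_S is the average sharing over all distinct pairs in the study sample. The function name `allele_sharing_inbreeding` and the input layout are hypothetical choices made for illustration.

```python
import numpy as np

def allele_sharing_inbreeding(dosages):
    """Sketch of an allele-sharing inbreeding estimate (assumed form, see lead-in).

    dosages: array of shape (n_individuals, n_snps) holding 0/1/2 counts
    of one allele at each biallelic SNP, with no missing values.
    Returns one estimate per individual, relative to the average
    pairwise allele sharing (average kinship) in this sample.
    """
    x = np.asarray(dosages, dtype=float) - 1.0  # recode 0/1/2 as -1/0/1
    n, n_snps = x.shape

    # Self-sharing: 1 if homozygous, 0 if heterozygous, averaged over SNPs.
    a_self = (x ** 2).mean(axis=1)

    # Pairwise sharing averaged over SNPs: 0.5 * (1 + mean of x_j * x_k).
    a_pair = 0.5 * (1.0 + (x @ x.T) / n_snps)

    # Average sharing over all distinct pairs of individuals in the sample.
    off_diagonal = ~np.eye(n, dtype=bool)
    a_sample = a_pair[off_diagonal].mean()

    return (a_self - a_sample) / (1.0 - a_sample)
```

Because the sample enters only through the single constant A_S, changing the study sample rescales and shifts every individual's estimate by the same increasing linear map. The values change with the sample, but the ordering of any set of shared individuals does not.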