Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), 28040 Madrid, Spain.
Genetics. 2012 May;191(1):195-213. doi: 10.1534/genetics.111.137521. Epub 2012 Feb 29.
Maximum likelihood methods for estimating linkage disequilibrium between biallelic DNA markers in half-sib families (the half-sib method) are developed for single-family and multifamily situations. Monte Carlo computer simulations were carried out for a variety of scenarios regarding sire genotypes, linkage disequilibrium, recombination fraction, family size, and number of families. A double heterozygote sire was simulated with a recombination fraction of 0.00, linkage disequilibrium among dams of δ = 0.10, and alleles at both markers segregating at intermediate frequencies, for a family size of 500. The average estimates of δ were 0.17, 0.25, and 0.10 for the Excoffier and Slatkin (1995), maternal informative haplotypes, and half-sib methods, respectively. A multifamily EM algorithm was tested at intermediate frequencies by computer simulation; the absolute difference between estimated and simulated δ ranged from 0.000 to 0.008. A cattle half-sib family was genotyped with the Illumina 50K BeadChip. There were 314,730 SNP pairs for which the sire was a homo-heterozygote, with average estimates of r² of 0.115, 0.067, and 0.111 for the half-sib, Excoffier and Slatkin (1995), and maternal informative haplotypes methods, respectively. There were 208,872 SNP pairs for which the sire was a double heterozygote, with average estimates of r² across the genome of 0.100, 0.267, and 0.925 for the half-sib, Excoffier and Slatkin (1995), and maternal informative haplotypes methods, respectively. Genome analyses for all possible sire genotypes, with 829,042 tests, showed that ignoring half-sib family structure leads to upward-biased estimates of linkage disequilibrium. Published inferences on the population structure and evolution of cattle should be revisited after accommodating existing half-sib family structure in the estimation of linkage disequilibrium.
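For orientation, the sketch below illustrates the kind of population-based estimator that the abstract reports as biased when family structure is ignored: a generic two-locus EM for haplotype frequencies from unphased genotype counts, in the spirit of Excoffier and Slatkin (1995). It is not the half-sib method of the paper, whose likelihood is not given in the abstract. The function name `em_ld`, the 3x3 count layout, and the standard definitions δ = p(AB) − p(A)p(B) and r² = δ² / [p(A)(1 − p(A))p(B)(1 − p(B))] are assumptions made here for illustration.

```python
def em_ld(counts, max_iter=500, tol=1e-12):
    """Estimate delta and r^2 for two biallelic loci from unphased genotypes.

    counts[i][j] = number of individuals carrying i copies of allele A at
    locus 1 and j copies of allele B at locus 2 (hypothetical layout).
    Individuals are treated as unrelated, i.e. family structure is ignored.
    """
    n = counts
    n_ind = sum(sum(row) for row in n)
    n_hap = 2 * n_ind

    # Haplotype counts from genotypes whose phase is unambiguous.
    base_AB = 2 * n[2][2] + n[2][1] + n[1][2]
    base_Ab = 2 * n[2][0] + n[2][1] + n[1][0]
    base_aB = 2 * n[0][2] + n[1][2] + n[0][1]
    base_ab = 2 * n[0][0] + n[1][0] + n[0][1]
    n_dh = n[1][1]  # double heterozygotes: phase is AB/ab or Ab/aB

    # Allele frequencies do not depend on phase.
    p_A = (base_AB + base_Ab + n_dh) / n_hap
    p_B = (base_AB + base_aB + n_dh) / n_hap

    # Start from linkage equilibrium.
    p_AB, p_Ab = p_A * p_B, p_A * (1 - p_B)
    p_aB, p_ab = (1 - p_A) * p_B, (1 - p_A) * (1 - p_B)

    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is AB/ab.
        cis, trans = p_AB * p_ab, p_Ab * p_aB
        w = cis / (cis + trans) if (cis + trans) > 0 else 0.5
        # M-step: update haplotype frequencies from expected counts.
        new_AB = (base_AB + w * n_dh) / n_hap
        new_Ab = (base_Ab + (1 - w) * n_dh) / n_hap
        new_aB = (base_aB + (1 - w) * n_dh) / n_hap
        new_ab = (base_ab + w * n_dh) / n_hap
        change = abs(new_AB - p_AB)
        p_AB, p_Ab, p_aB, p_ab = new_AB, new_Ab, new_aB, new_ab
        if change < tol:
            break

    delta = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = delta ** 2 / denom if denom > 0 else float("nan")
    return delta, r2


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    example = [[20, 0, 0], [0, 10, 0], [0, 0, 20]]
    print(em_ld(example))
```

Applied to the offspring of a single sire, an estimator of this form treats the paternally inherited haplotypes as if they were drawn from the population, which is one way to picture why the abstract argues for modelling the half-sib structure explicitly.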