Animal Breeding and Genetics Group, Department of Animal Sciences, Georg-August University, 37075 Göttingen, Germany.
Anim Genet. 2010 Aug;41(4):346-56. doi: 10.1111/j.1365-2052.2009.02011.x. Epub 2010 Jan 3.
This study presents a second generation of linkage disequilibrium (LD) map statistics for the whole genome of the Holstein-Friesian population, with four times the resolution of the maps available so far. We used DNA samples of 810 German Holstein-Friesian cattle genotyped with the Illumina Bovine SNP50K BeadChip to analyse LD structure. A panel of 40 854 (75.6%) markers was included in the final analysis. The pairwise r² statistic was estimated for SNPs up to 5 Mb apart across the genome. A mean value of r² = 0.30 ± 0.32 was observed at pairwise distances of <25 kb, and it dropped to 0.20 ± 0.24 at 50-75 kb, which is close to the average inter-marker spacing in this study. The proportion of SNP pairs in useful LD (r² ≥ 0.25) was 26% for distances between 50 and 75 kb. We found a lower level of LD for SNP pairs at distances ≤100 kb than previously thought. Analysis revealed 712 haplo-blocks spanning 4.7% of the genome and containing 8.0% of all SNPs. Mean and median block lengths were estimated as 164 ± 117 kb and 144 kb, respectively. Allele frequencies of the SNPs have a considerable and systematic impact on the estimate of r². It is shown that minimizing the allele frequency difference between SNPs reduces the influence of frequency on r² estimates. Analysis of past effective population size, based on direct estimates of recombination rates from SNP data, showed a decline in effective population size to N_e = 103 up to approximately 4 generations ago. Systematic effects of marker density and effective population size on observed LD and haplotype structure are discussed.
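The abstract does not state the formulas behind these statistics, so the following is only a minimal sketch under stated assumptions. It uses the standard definition of r² between two biallelic loci, the frequency-imposed upper bound on r² (which illustrates why mismatched allele frequencies depress the estimate), and Sved's (1971) approximation E(r²) ≈ 1/(1 + 4·N_e·c) to convert LD into an effective population size. The function names and the example conversion of 1 cM per Mb are illustrative assumptions and are not taken from the study, which reports using direct estimates of recombination rates.

```python
def r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


def r_squared_max(p_a: float, p_b: float) -> float:
    """Largest r^2 attainable for the given allele frequencies."""
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d_pos = min(p_a * (1 - p_b), (1 - p_a) * p_b)  # bound on positive D
    d_neg = min(p_a * p_b, (1 - p_a) * (1 - p_b))  # bound on |negative D|
    return max(d_pos, d_neg) ** 2 / denom


def ne_sved(mean_r2: float, c_morgans: float) -> float:
    """Effective population size from Sved's approximation (an assumption here)."""
    if not 0 < mean_r2 <= 1 or c_morgans <= 0:
        raise ValueError("need 0 < r^2 <= 1 and c > 0")
    return (1 / mean_r2 - 1) / (4 * c_morgans)


if __name__ == "__main__":
    # Equal allele frequencies allow complete LD; unequal ones cap r^2 well below 1.
    print(r_squared_max(0.3, 0.3), r_squared_max(0.1, 0.5))
    # Hypothetical conversion: 62.5 kb at an assumed 1 cM/Mb = 0.000625 Morgans.
    print(ne_sved(0.20, 0.000625))
```

Under this approximation, LD between markers a genetic distance c apart is usually taken to reflect N_e roughly 1/(2c) generations ago, so short-range LD informs about the distant past and long-range LD about recent generations.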