Baruch Eyal, Weller Joel Ira, Cohen-Zinder Miri, Ron Micha, Seroussi Eyal
Institute of Animal Sciences, ARO, The Volcani Center, Bet Dagan 50250, Israel.
Genetics. 2006 Mar;172(3):1757-65. doi: 10.1534/genetics.105.047134. Epub 2005 Dec 15.
We present a simple algorithm for reconstruction of haplotypes from a sample of multilocus genotypes. The algorithm is intended specifically for the analysis of small chromosomal segments in very large pedigrees, where the recombination frequency within the segment can be assumed to be zero. The algorithm was tested both on simulated pedigrees of 155 individuals in a three-generation family structure and on real data of 1149 animals from the Israeli Holstein dairy cattle population, including 406 bulls with genotypes, but no females with genotypes. The rate of haplotype resolution for the simulated data was >91% with a standard deviation of 2%. With 20% missing data, the rate of haplotype resolution was 67.5% with a standard deviation of 1.3%. In both cases all recovered haplotypes were correct. In the real data, allele origin was resolved for 22% of the heterozygous genotypes, even though 70% of the genotypes were missing. Haplotypes were resolved for 36% of the males. Computing time was negligible for both data sets. Despite the intricacy of large-scale real pedigree genotypes, the proposed algorithm provides a practical rule-based solution for resolving haplotypes for small chromosomal segments in commercial animal populations.
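To illustrate how the zero-recombination assumption can drive a rule-based resolution step, the following is a minimal sketch and not the authors' algorithm, whose actual rules are not given in the abstract. It assumes the sire's two haplotypes are already known and the offspring is fully genotyped with no missing data. The function name `resolve_offspring` and the data layout (haplotypes as tuples of alleles, genotypes as unordered allele pairs) are hypothetical choices made for the example.

```python
from typing import Optional

Haplotype = tuple[int, ...]                 # one allele per locus
Genotype = list[tuple[int, int]]            # unordered allele pair per locus


def resolve_offspring(
    child: Genotype, sire_haplotypes: tuple[Haplotype, Haplotype]
) -> Optional[tuple[Haplotype, Haplotype]]:
    """Return (paternal, maternal) haplotypes, or None if unresolved.

    Illustrative assumption: with zero recombination the sire transmits
    one of his two haplotypes intact, so the maternal haplotype is whatever
    remains of the child's genotype at each locus.
    """
    candidates = []
    # A set avoids counting the same haplotype twice for a homozygous sire.
    for paternal in set(sire_haplotypes):
        maternal = []
        for allele, (a, b) in zip(paternal, child):
            if allele == a:
                maternal.append(b)
            elif allele == b:
                maternal.append(a)
            else:
                break  # this sire haplotype is incompatible with the child
        else:
            candidates.append((paternal, tuple(maternal)))
    # Resolve only when exactly one sire haplotype explains the genotype.
    return candidates[0] if len(candidates) == 1 else None


sire = ((1, 2, 1), (2, 2, 2))
resolved = resolve_offspring([(1, 1), (2, 2), (1, 2)], sire)
ambiguous = resolve_offspring([(1, 2), (2, 2), (1, 2)], sire)
print(resolved, ambiguous)
```

In this sketch an offspring whose genotype is compatible with both sire haplotypes is left unresolved rather than guessed, which is in the spirit of the abstract's report that all recovered haplotypes were correct.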