Center for Human Genetics Research, Massachusetts General Hospital, Boston, Massachusetts 02114, USA.
Genetics. 2010 Jul;185(3):1081-95. doi: 10.1534/genetics.110.115014. Epub 2010 May 3.
The genetics of phenotypic variation in inbred mice has for nearly a century provided a primary weapon in the medical research arsenal. A catalog of the genetic variation among inbred mouse strains, however, is required to enable powerful positional cloning and association techniques. A recent whole-genome resequencing study of 15 inbred mouse strains captured a significant fraction of the genetic variation among a limited number of strains, yet the common use of hundreds of inbred strains in medical research motivates the need for a high-density variation map of a larger set of strains. Here we report a dense set of genotypes from 94 inbred mouse strains containing 10.77 million genotypes over 121,433 single nucleotide polymorphisms (SNPs), dispersed at 20-kb intervals on average across the genome, with an average concordance of 99.94% with previous SNP sets. Through pairwise comparisons of the strains, we identified an average of 4.70 distinct segments over 73 classical inbred strains in each region of the genome, suggesting limited genetic diversity between the strains. Combining these data with genotypes of 7570 gap-filling SNPs, we further imputed the untyped or missing genotypes of 94 strains over 8.27 million Perlegen SNPs. The imputation accuracy among classical inbred strains is estimated at 99.7% for the genotypes imputed with high confidence. We demonstrate the utility of these data in high-resolution linkage mapping through power simulations and statistical power analysis, and we provide guidelines for developing such studies. We also provide a resource of in silico association mapping between the complex traits deposited in the Mouse Phenome Database and our genotypes. We expect that these resources will facilitate effective designs of both human and mouse studies for dissecting the genetic basis of complex traits.
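As a rough consistency check on the figures quoted above, the following Python sketch compares the reported genotype count with the size of a full 94 x 121,433 strain-by-SNP matrix and computes the genomic span implied by the average spacing. It is an illustration only, not part of the study's methods: the function names are hypothetical, and it assumes that the 10.77 million figure counts called genotypes across all 94 strains and all 121,433 SNPs, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract.
N_STRAINS = 94
N_SNPS = 121_433
REPORTED_GENOTYPES = 10.77e6  # "10.77 million genotypes"
MEAN_SPACING_KB = 20          # "20-kb intervals on average"


def genotype_matrix_size(n_strains: int, n_snps: int) -> int:
    """Number of cells in a complete strain-by-SNP genotype matrix."""
    return n_strains * n_snps


def fill_fraction(reported: float, n_strains: int, n_snps: int) -> float:
    """Fraction of the full matrix covered by the reported genotypes.

    Assumption: the reported count refers to called genotypes over the
    whole strain-by-SNP matrix.
    """
    return reported / genotype_matrix_size(n_strains, n_snps)


def implied_span_gb(n_snps: int, spacing_kb: float) -> float:
    """Genomic span (in Gb) implied by SNP count times mean spacing."""
    return n_snps * spacing_kb / 1e6  # 1 Gb = 1e6 kb


if __name__ == "__main__":
    total = genotype_matrix_size(N_STRAINS, N_SNPS)
    frac = fill_fraction(REPORTED_GENOTYPES, N_STRAINS, N_SNPS)
    span = implied_span_gb(N_SNPS, MEAN_SPACING_KB)
    print(f"Full matrix size:      {total:,} genotypes")
    print(f"Reported / full:       {frac:.1%}")
    print(f"Implied genomic span:  {span:.2f} Gb")
```

Under that assumption, the reported genotypes would fill a little over 94% of the full matrix, with the remainder being untyped or missing calls of the kind the imputation step is described as addressing.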