Szatkiewicz Jin P, Beane Glen L, Ding Yueming, Hutchins Lucie, Pardo-Manuel de Villena Fernando, Churchill Gary A
The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine 04609, USA.
Mamm Genome. 2008 Mar;19(3):199-208. doi: 10.1007/s00335-008-9098-9. Epub 2008 Feb 27.
We have created a high-density SNP resource encompassing 7.87 million polymorphic loci across 49 inbred strains of the laboratory mouse by combining data available from public databases and training a hidden Markov model to impute missing genotypes in the combined data. The strong linkage disequilibrium found in dense sets of SNP markers in the laboratory mouse provides the basis for accurate imputation. Using genotypes from eight independent SNP resources, we empirically validated the quality of the imputed genotypes and demonstrated that they are highly reliable for most inbred strains. The imputed SNP resource will be useful for studies of natural variation and complex traits. It will facilitate association study designs by providing high-density SNP genotypes for large numbers of mouse strains. We anticipate that this resource will continue to evolve as new genotype data become available for laboratory mouse strains. The data are available for bulk download or query at http://cgd.jax.org/.
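To illustrate the general idea of hidden Markov model imputation mentioned above, the following is a minimal sketch and not the authors' model. It assumes a toy HMM whose hidden states are ancestral haplotype patterns, biallelic SNPs coded 0/1, and hand-picked parameters instead of trained ones. The function name `impute_missing`, the `MISSING` code, and all numeric values are hypothetical. Missing calls are filled in from forward-backward posterior probabilities, which is where strong linkage disequilibrium helps: neighbouring observed SNPs pin down the hidden state at the missing position.

```python
MISSING = -1  # hypothetical code for an unobserved genotype


def impute_missing(obs, start, trans, emit):
    """Impute missing alleles in one strain with a toy HMM.

    obs   : list of 0, 1 or MISSING, one entry per SNP
    start : start[k] = P(state k at the first SNP)
    trans : trans[j][k] = P(state k at the next SNP | state j)
    emit  : emit[t][k] = P(allele 1 at SNP t | state k), assumed strictly
            between 0 and 1 so that no observation has zero probability
    Returns a list of (allele, confidence) pairs.
    """
    n, n_states = len(obs), len(start)

    def likelihood(t, k):
        # A missing observation carries no information about the state.
        if obs[t] == MISSING:
            return 1.0
        p1 = emit[t][k]
        return p1 if obs[t] == 1 else 1.0 - p1

    # Forward pass, normalised at every SNP to avoid numerical underflow.
    fwd = []
    for t in range(n):
        row = []
        for k in range(n_states):
            if t == 0:
                prior = start[k]
            else:
                prior = sum(fwd[t - 1][j] * trans[j][k] for j in range(n_states))
            row.append(prior * likelihood(t, k))
        total = sum(row)
        fwd.append([v / total for v in row])

    # Backward pass, also normalised.
    bwd = [[1.0] * n_states for _ in range(n)]
    for t in range(n - 2, -1, -1):
        row = [
            sum(trans[k][j] * likelihood(t + 1, j) * bwd[t + 1][j]
                for j in range(n_states))
            for k in range(n_states)
        ]
        total = sum(row)
        bwd[t] = [v / total for v in row]

    result = []
    for t in range(n):
        if obs[t] != MISSING:
            result.append((obs[t], 1.0))  # observed calls are kept unchanged
            continue
        post = [fwd[t][k] * bwd[t][k] for k in range(n_states)]
        total = sum(post)
        # Posterior probability that the missing allele is 1.
        p1 = sum(post[k] / total * emit[t][k] for k in range(n_states))
        result.append((1 if p1 >= 0.5 else 0, max(p1, 1.0 - p1)))
    return result


# Toy example: two haplotype states with opposite allele patterns over 5 SNPs.
start = [0.5, 0.5]
trans = [[0.95, 0.05], [0.05, 0.95]]  # "sticky" states mimic strong LD
emit = [[0.01, 0.99], [0.01, 0.99], [0.99, 0.01], [0.01, 0.99], [0.99, 0.01]]
observed = [0, 0, MISSING, 0, 1]
imputed = impute_missing(observed, start, trans, emit)
print(imputed[2])  # imputed allele and its posterior confidence at the gap
```

In this toy setting the flanking SNPs all match the first state's pattern, so the gap is filled with that state's allele. A per-call confidence value of this kind is one way an imputed genotype could be filtered before downstream use; the thresholds and parameters used for the actual resource are not given in this abstract.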