Eck Sebastian H, Benet-Pagès Anna, Flisikowski Krzysztof, Meitinger Thomas, Fries Ruedi, Strom Tim M
Institute of Human Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstädter Landstr, 85764 Neuherberg, Germany.
Genome Biol. 2009;10(8):R82. doi: 10.1186/gb-2009-10-8-r82. Epub 2009 Aug 6.
The majority of the 2 million bovine single nucleotide polymorphisms (SNPs) currently available in dbSNP have been identified in a single breed, Hereford cattle, during the bovine genome project. In an attempt to evaluate the variation of a second breed, we have produced a whole genome sequence at low coverage of a single Fleckvieh bull.
We generated 24 gigabases of sequence, mainly using 36-bp paired-end reads, resulting in an average 7.4-fold sequence depth. This coverage was sufficient to identify 2.44 million SNPs, 82% of which were previously unknown, and 115,000 small indels. A comparison with the genotypes of the same animal, generated on a 50k oligonucleotide chip, revealed detection rates of 74% for homozygous and 30% for heterozygous SNPs. The false positive rate, as determined by comparison with genotypes obtained for 196 randomly selected SNPs, was approximately 1.1%. We further determined the allele frequencies of the 196 SNPs in 48 Fleckvieh and 48 Braunvieh bulls. Of these SNPs, 95% were polymorphic, with an average minor allele frequency of 24.5%, and 83% had a minor allele frequency larger than 5%.
This work provides the first single cattle genome by next-generation sequencing. The chosen approach, low to medium coverage re-sequencing, added more than 2 million novel SNPs to the currently publicly available SNP resource, providing a valuable basis for the construction of high density oligonucleotide arrays in the context of genome-wide association studies.
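As a rough illustration of the arithmetic behind these figures, the sketch below recomputes the number of novel SNPs from the reported totals and shows one conventional way to compute a minor allele frequency from genotype counts. The function names and the genotype counts in the usage lines are hypothetical and are not taken from the study; only the 2.44 million total and the 82% novel fraction come from the abstract.

```python
def novel_snp_count(total_snps: float, novel_fraction: float) -> float:
    """Number of previously unknown SNPs, given a total and a novel fraction."""
    return total_snps * novel_fraction


def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Minor allele frequency of a biallelic SNP from genotype counts.

    n_aa, n_bb: animals homozygous for allele A or allele B
    n_ab:       heterozygous animals
    """
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("at least one genotyped animal is required")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq_a, 1.0 - freq_a)


# Figures reported in the abstract: 2.44 million SNPs, 82% previously unknown.
novel = novel_snp_count(2_440_000, 0.82)

# Hypothetical genotype counts for one SNP typed in 48 bulls (not study data).
maf = minor_allele_frequency(n_aa=30, n_ab=15, n_bb=3)
is_polymorphic = maf > 0.0   # both alleles observed in the sample
above_5_percent = maf > 0.05  # threshold used in the abstract
```

With the reported totals, the product comes to roughly 2.0 million, in line with the statement that more than 2 million novel SNPs were added.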