National Key Laboratory of Crop Genetic Improvement, National Center of Plant Gene Research (Wuhan), Huazhong Agricultural University, Wuhan 430070, China.
Proc Natl Acad Sci U S A. 2010 Jun 8;107(23):10578-83. doi: 10.1073/pnas.1005931107. Epub 2010 May 24.
Bar-coded multiplexed sequencing approaches based on new-generation sequencing technologies provide capacity to sequence a mapping population in a single sequencing run. However, such approaches usually generate low-coverage and error-prone sequences for each line in a population. Thus, it is a significant challenge to genotype individual lines in a population for linkage map construction based on low-coverage sequences without the availability of high-quality genotype data of the parental lines. In this paper, we report a method for constructing ultrahigh-density linkage maps composed of high-quality single-nucleotide polymorphisms (SNPs) based on low-coverage sequences of recombinant inbred lines. First, all potential SNPs were identified to obtain drafts of parental genotypes using a maximum parsimonious inference of recombination, making maximum use of SNP information found in the entire population. Second, high-quality SNPs were identified by filtering out low-quality ones by permutations involving resampling of windows of SNPs followed by Bayesian inference. Third, lines in the mapping population were genotyped using the high-quality SNPs assisted by a hidden Markov model. With 0.05x genome sequence per line, an ultrahigh-density linkage map composed of bins of high-quality SNPs using 238 recombinant inbred lines derived from a cross between two rice varieties was constructed. Using this map, a quantitative trait locus for grain width (GW5) was localized to its presumed genomic region in a bin of 200 kb, confirming the accuracy and quality of the map. This method is generally applicable in genetic map construction with low-coverage sequence data.
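The third step, genotyping each line with the help of a hidden Markov model, can be illustrated with a minimal sketch. This is not the authors' implementation: the two-state model (parent A or parent B at each SNP), the function names `viterbi_genotype` and `breakpoints`, and the values of `error_rate` and `switch_prob` are assumptions chosen for illustration, and the transition probability is taken as constant per SNP rather than scaled by physical distance.

```python
import math


def viterbi_genotype(obs, error_rate=0.05, switch_prob=0.01):
    """Infer the most likely parental origin at each SNP of one inbred line.

    obs: list of 'A', 'B', or None (no read covered the SNP).
    error_rate and switch_prob are illustrative assumptions.
    """
    states = ("A", "B")
    log_stay = math.log(1.0 - switch_prob)
    log_switch = math.log(switch_prob)

    def log_emit(state, call):
        # A missing call carries no information about the hidden state.
        if call is None:
            return 0.0
        return math.log(1.0 - error_rate) if call == state else math.log(error_rate)

    def log_trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Initialise with equal prior probability for both parents.
    score = {s: math.log(0.5) + log_emit(s, obs[0]) for s in states}
    back = []
    for call in obs[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + log_trans(p, s))
            pointers[s] = best_prev
            new_score[s] = score[best_prev] + log_trans(best_prev, s) + log_emit(s, call)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def breakpoints(path):
    """Indices at which the inferred parental origin changes."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


calls = ["A", "A", None, "A", "B", "A", "A", "B", "B", None, "B", "B"]
genotype = viterbi_genotype(calls)
print("".join(genotype), breakpoints(genotype))
```

In this toy input, the single `B` call inside the run of `A` calls is cheaper to explain as a sequencing error than as two recombination events, so the decoded path keeps it as `A`; the later run of `B` calls is decoded as one crossover. Consecutive SNPs with no breakpoint between them in any line are what a bin map would group together.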