Feltus F Alex, Wan Jun, Schulze Stefan R, Estill James C, Jiang Ning, Paterson Andrew H
Plant Genome Mapping Laboratory, University of Georgia, Athens, Georgia 30602, USA.
Genome Res. 2004 Sep;14(9):1812-9. doi: 10.1101/gr.2479404.
Dense coverage of the rice genome with polymorphic DNA markers is an invaluable tool for DNA marker-assisted breeding, positional cloning, and a wide range of evolutionary studies. We have aligned draft sequences of two rice subspecies, indica and japonica, and analyzed levels and patterns of genetic diversity. After filtering out multiple-copy and low-quality sequence, 408,898 candidate DNA polymorphisms (SNPs/INDELs) were discerned between the two subspecies. Because of these filters, our data set includes only a subset of the available SNPs (in particular, it excludes large numbers of SNPs that may occur between repetitive DNA alleles), but the filters increase the likelihood that this subset is useful: direct sequencing suggests that 79.8% ± 7.5% of the in silico SNPs are real. The SNP sample in our database is not randomly distributed across the genome. In fact, 566 rice genomic regions had unusually high (328 contigs/48.6 Mb/13.6% of genome) or low (237 contigs/64.7 Mb/18.1% of genome) polymorphism rates. Many SNP-poor regions were substantially longer than most SNP-rich regions, covering up to 4 Mb, possibly reflecting introgression between the respective gene pools hundreds of years ago. Although 46.2% ± 8.3% of the SNPs differentiate other pairs of japonica and indica genotypes, SNP rates in rice were not predictive of evolutionary rates for corresponding genes in another grass species, sorghum. The data set is freely available at http://www.plantgenome.uga.edu/snp.
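The span and percentage figures quoted for the SNP-rich and SNP-poor regions can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the authors' analysis: the dictionary keys and the helper `implied_genome_mb` are hypothetical names, and the only inputs are the numbers stated in the abstract.

```python
# Figures quoted in the abstract: (span in Mb, share of genome in percent).
# The keys "snp_rich" and "snp_poor" are labels chosen for this sketch.
regions = {
    "snp_rich": (48.6, 13.6),
    "snp_poor": (64.7, 18.1),
}


def implied_genome_mb(span_mb, percent):
    """Genome size implied by a span and the share of the genome it covers."""
    return span_mb / (percent / 100.0)


# Each region class gives an independent estimate of the genome size used.
implied = {name: implied_genome_mb(mb, pct) for name, (mb, pct) in regions.items()}

# Combined extent of the regions with atypical polymorphism rates.
total_mb = sum(mb for mb, _ in regions.values())
total_pct = sum(pct for _, pct in regions.values())

for name, size in implied.items():
    print(f"{name}: implied genome size ~{size:.1f} Mb")
print(f"atypical regions combined: {total_mb:.1f} Mb, {total_pct:.1f}% of genome")
```

Both pairs of figures imply a genome size of roughly 357 Mb, so they are mutually consistent. Together, the regions with atypical polymorphism rates cover about 113.3 Mb, or 31.7% of the genome.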