Department of Molecular and Computational Biology, University of Southern California, Los Angeles, CA, USA.
Plant J. 2010 Aug;63(4):623-35. doi: 10.1111/j.1365-313X.2010.04267.x.
Genome-wide association studies rely upon segregating natural genetic variation, particularly the patterns of polymorphism and correlation between adjacent markers. To facilitate association studies in the model legume Medicago truncatula, we present a genome-scale polymorphism scan using existing Affymetrix microarrays. We develop and validate a method that uses a simple information-criteria algorithm to call polymorphism from microarray data without reliance on a reference genotype. We genotype 12 inbred M. truncatula lines sampled from four wild Tunisian populations and find polymorphisms at approximately 7% of features, comprising 31 419 probes. Only approximately 3% of these markers assort by population, and of these only 10% differentiate between populations from saline and non-saline sites. Fifty-two differentiated probes with unique genome locations correspond to 18 distinct genome regions. Sanger resequencing was used to characterize a subset of marker loci and develop a single nucleotide polymorphism (SNP)-typing assay that confirmed marker assortment by habitat in an independent sample of 33 individuals from the four populations. Genome-wide linkage disequilibrium (LD) extends on average for approximately 10 kb, falling to background levels by approximately 500 kb. A similar range of LD decay was observed in the 18 genome regions that assort by habitat; these LD blocks delimit candidate genes for local adaptation, many of which encode proteins with predicted functions in abiotic stress tolerance and are targets for functional genomic studies. Tunisian M. truncatula populations contain substantial amounts of genetic variation that is structured in relatively small LD blocks, suggesting a history of migration and recombination. These populations provide a strong resource for genome-wide association studies.