Canada Research Chair in Forest and Environmental Genomics, Centre for Forest Research and Institute for Systems and Integrative Biology, Université Laval, Québec, Canada.
Genome Biol Evol. 2013;5(10):1910-25. doi: 10.1093/gbe/evt143.
Gene families differ in composition, expression, and chromosomal organization between conifers and angiosperms, but little is known about how they differ in nucleotide polymorphism. Using various sequencing strategies, an atlas of 212k high-confidence single nucleotide polymorphisms (SNPs) with a validation rate of more than 92% was developed for the conifer white spruce (Picea glauca). Nonsynonymous and synonymous SNPs were annotated over the corresponding 13,498 white spruce genes, representative of 2,457 known gene families. Patterns of nucleotide polymorphism were analyzed by estimating the ratio of nonsynonymous to synonymous numbers of substitutions per site (A/S). A general excess of synonymous SNPs was expected and observed. However, analysis from several perspectives made it possible to identify groups of genes harboring an excess of nonsynonymous SNPs, and thus potentially under positive selection. Four known gene families harbored such an excess: dehydrins, ankyrin repeats, AP2/DREB, and leucine-rich repeats. Conifer-specific sequences were also generally associated with the highest A/S ratios. A/S values were also distributed asymmetrically across genes expressed specifically in megagametophytes, in roots, or in both, and these genes harbored on average an excess of nonsynonymous SNPs. These patterns confirm that the breadth of gene expression is a contributing factor to the evolution of nucleotide polymorphism. The A/S ratios of Medicago truncatula genes were also analyzed: several gene families shared between the P. glauca and M. truncatula data sets showed a similar excess of synonymous or nonsynonymous SNPs. However, a number of families with high A/S ratios were found to be specific to P. glauca, suggesting cases of divergent evolution at the functional level.
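As an illustration of the A/S ratio described above, the following Python sketch divides the number of nonsynonymous SNPs per nonsynonymous site by the number of synonymous SNPs per synonymous site. It is a minimal sketch, not the authors' pipeline: the abstract does not state how sites were counted or how genes were pooled, so the per-gene site counts are assumed to be supplied by the user, and the names `GeneSnpCounts`, `a_s_ratio`, and `pooled_a_s_ratio` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class GeneSnpCounts:
    """SNP and site counts for one gene (hypothetical container)."""
    nonsyn_snps: int      # number of nonsynonymous SNPs observed
    syn_snps: int         # number of synonymous SNPs observed
    nonsyn_sites: float   # number of nonsynonymous sites (assumed precomputed)
    syn_sites: float      # number of synonymous sites (assumed precomputed)


def a_s_ratio(c: GeneSnpCounts) -> Optional[float]:
    """Return A/S, or None when the ratio is undefined."""
    if c.syn_snps == 0 or c.syn_sites <= 0 or c.nonsyn_sites <= 0:
        return None
    a = c.nonsyn_snps / c.nonsyn_sites  # nonsynonymous SNPs per site
    s = c.syn_snps / c.syn_sites        # synonymous SNPs per site
    return a / s


def pooled_a_s_ratio(genes: Iterable[GeneSnpCounts]) -> Optional[float]:
    """A/S for a group of genes (e.g. a gene family), pooling counts first.

    Pooling before dividing is an assumption made here for illustration.
    """
    total = GeneSnpCounts(0, 0, 0.0, 0.0)
    for g in genes:
        total.nonsyn_snps += g.nonsyn_snps
        total.syn_snps += g.syn_snps
        total.nonsyn_sites += g.nonsyn_sites
        total.syn_sites += g.syn_sites
    return a_s_ratio(total)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    gene = GeneSnpCounts(nonsyn_snps=9, syn_snps=2,
                         nonsyn_sites=600.0, syn_sites=200.0)
    ratio = a_s_ratio(gene)
    label = "excess of nonsynonymous SNPs" if ratio and ratio > 1 else "excess of synonymous SNPs"
    print(f"A/S = {ratio:.2f} ({label})")
```

Under this definition, a ratio above 1 corresponds to an excess of nonsynonymous SNPs and a ratio below 1 to an excess of synonymous SNPs. Genes with no synonymous SNPs have an undefined ratio, which the sketch reports as `None` rather than substituting a value.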