Department of Biology, West Virginia University, Morgantown, West Virginia 26506, USA.
Genome Res. 2012 Jan;22(1):95-105. doi: 10.1101/gr.125146.111. Epub 2011 Oct 5.
Comparative analysis of multiple angiosperm genomes has implicated gene duplication in the expansion and diversification of many gene families. However, empirical data and theory suggest that whole-genome and small-scale duplication events differ with respect to the types of genes preserved as duplicate pairs. We compared gene duplicates resulting from a recent whole-genome duplication to a set of tandemly duplicated genes in the model forest tree Populus trichocarpa. We used a combination of microarray expression analyses of a diverse set of tissues and functional annotation to assess factors related to the preservation of duplicate genes of both types. Whole-genome duplicates are 700 bp longer and are expressed in 20% more tissues than tandem duplicates. Furthermore, certain functional categories are over-represented in each class of duplicates. In particular, disease resistance genes and receptor-like kinases commonly occur in tandem but are significantly under-retained following whole-genome duplication, while whole-genome duplicate pairs are enriched for members of signal transduction cascades and transcription factors. The shape of the distribution of expression divergence for duplicated pairs suggests that nearly half of the whole-genome duplicates have diverged in expression by a random degeneration process. The remaining pairs have more conserved gene expression than expected by chance, consistent with a role for selection under the constraints of gene balance. We hypothesize that duplicate gene preservation in Populus is driven by a combination of subfunctionalization of duplicate pairs and purifying selection favoring retention of genes encoding proteins with large numbers of interactions.