Gene Discovery Research Group, RIKEN Plant Science Center, Yokohama, Kanagawa, Japan.
PLoS Genet. 2009 Dec;5(12):e1000781. doi: 10.1371/journal.pgen.1000781. Epub 2009 Dec 24.
Differentiation of both gene expression and protein function is thought to be an important mechanism for the functionalization of duplicate genes. However, it has not been addressed whether expression or protein divergence is greater in duplicate genes that have undergone functionalization than in those that have not. We examined a total of 492 paralogous gene pairs associated with morphological diversification in the model plant Arabidopsis thaliana. Classifying these paralogous gene pairs into high, low, and no morphological diversification groups on the basis of knock-out data, we found that the divergence rates of both gene expression and protein sequence were significantly higher in either the high or the low morphological diversification group than in the no morphological diversification group. These results strongly suggest that divergence of both expression and protein sequence is an important source of morphological diversification of duplicate genes. Although the two mechanisms are not mutually exclusive, our analysis suggested that changes in expression pattern play the minor role (33%-41%) and that changes in protein sequence play the major role (59%-67%) in morphological diversification. Finally, we examined, at the whole-genome level, to what extent duplicate genes are associated with expression or protein divergence leading to morphological diversification. Interestingly, duplicate genes randomly chosen from A. thaliana had not experienced expression or protein divergence that resulted in morphological diversification. These results indicate that most duplicate genes have experienced minor functionalization.
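The group comparison described in the abstract (divergence in the high or low morphological diversification group versus the no-diversification group) can be illustrated with a small sketch. This is not the authors' method: the abstract does not name the statistical test used, so a one-sided permutation test on the difference of medians is swapped in here as a generic stand-in. The function name `permutation_test_median`, the dictionary `divergence_by_group`, and all numeric values are hypothetical placeholders, not data from the study.

```python
import random
from statistics import median


def permutation_test_median(group_a, group_b, n_permutations=10_000, seed=0):
    """One-sided permutation test: is median(group_a) greater than median(group_b)?

    Returns the observed median difference and an estimated p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = median(group_a) - median(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = median(pooled[:n_a]) - median(pooled[n_a:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_permutations + 1)


# Placeholder divergence scores per gene pair (hypothetical, NOT from the paper).
divergence_by_group = {
    "high": [0.62, 0.71, 0.55, 0.80, 0.67],
    "low": [0.58, 0.49, 0.66, 0.73, 0.60],
    "none": [0.21, 0.35, 0.28, 0.30, 0.25],
}

for label in ("high", "low"):
    diff, p_value = permutation_test_median(
        divergence_by_group[label], divergence_by_group["none"]
    )
    print(f"{label} vs none: median difference = {diff:.2f}, p = {p_value:.4f}")
```

The same function could be applied separately to an expression divergence measure and a protein sequence divergence measure, assuming each gene pair carries one score of each kind.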