Johnson Deborah A, Thomas Michael A
Department of Biological Sciences, Idaho State University, USA.
Mol Biol Evol. 2007 Nov;24(11):2412-23. doi: 10.1093/molbev/msm184. Epub 2007 Sep 6.
Current hypotheses of gene duplicate divergence propose that surviving members of a gene duplicate pair may evolve, under conditions of purifying or nearly neutral selection, in one of two ways: a new function arises in one duplicate while the other retains the original function (neofunctionalization [NF]), or the original function is partitioned between the 2 paralogs (subfunctionalization [SF]). More recent studies propose that SF followed by NF (subneofunctionalization [SNF]) explains the divergence of many duplicate genes. In this analysis, we evaluate these hypotheses in the context of the large monosaccharide transporter (MST) gene families in Arabidopsis and rice. MSTs have an ancient origin, predating plants, and have evolved in the seed plant lineage to comprise 7 subfamilies. In Arabidopsis, 53 putative MST genes have been identified, with one subfamily greatly expanded by tandem gene duplications. We searched the rice genome for members of the MST gene family and compared them with the MST gene family in Arabidopsis to determine subfamily expansion patterns and estimate gene duplicate divergence times. We tested hypotheses of gene duplicate divergence in 24 paralog pairs by comparing protein sequence divergence rates, estimating positive selection on codon sites, and analyzing tissue expression patterns. Results reveal that the MST gene family is significantly larger in rice (65 genes), with 2 subfamilies greatly expanded by tandem duplications. Gene duplicate divergence time estimates indicate that early diversification of most subfamilies occurred in the Proterozoic (2500-540 Myr ago) and that expansion of large subfamilies continued through the Cenozoic (65-0 Myr ago). Two-thirds of paralog pairs show statistically symmetric rates of sequence evolution, most consistent with the SF model, and half of those show evidence for positive selection in one or both genes. Among the 8 paralog pairs showing asymmetric divergence rates, most consistent with the NF model, nearly half show evidence of positive selection. Positive selection does not appear in any duplicate pairs younger than approximately 34 Myr. Our data suggest that the NF, SF, and SNF models describe different outcomes along a continuum of divergence resulting from initial conditions of relaxed constraint after duplication.
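As an illustration of what "statistically symmetric" versus "asymmetric" divergence can mean for a paralog pair, the sketch below implements Tajima's relative rate test. The abstract does not state which test the authors used, so the choice of this test is an assumption, as are the function name `tajima_relative_rate` and the toy sequences. The test counts sites where only one paralog differs from an outgroup and compares the two counts with a chi-square statistic (1 degree of freedom).

```python
import math


def tajima_relative_rate(paralog_a, paralog_b, outgroup):
    """Tajima's relative rate test on three aligned sequences.

    Hypothetical helper for illustration only; not the authors' pipeline.
    Returns (m_a, m_b, chi2, p_value), where m_a and m_b count the sites
    changed uniquely in paralog A and paralog B relative to the outgroup.
    """
    if not (len(paralog_a) == len(paralog_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    m_a = m_b = 0
    for a, b, o in zip(paralog_a, paralog_b, outgroup):
        if "-" in (a, b, o):
            continue  # skip alignment gaps
        if b == o and a != o:
            m_a += 1  # change unique to paralog A
        elif a == o and b != o:
            m_b += 1  # change unique to paralog B

    total = m_a + m_b
    if total == 0:
        return m_a, m_b, 0.0, 1.0  # no informative sites

    chi2 = (m_a - m_b) ** 2 / total
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m_a, m_b, chi2, p_value


if __name__ == "__main__":
    # Toy aligned sequences (assumed data, not from the study).
    outgroup = "AAAAAAAAAA"
    print(tajima_relative_rate("CCAAAAAAAA", "AADDAAAAAA", outgroup))
    print(tajima_relative_rate("CCCCCCAAAA", "AAAAAAAAAA", outgroup))
```

Under this reading, a pair whose p-value stays above the chosen threshold would be called symmetric (more consistent with SF), and a pair below it asymmetric (more consistent with NF). With only a handful of informative sites, as in the toy data, the chi-square approximation is rough.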