Daburon Virginie, Mella Sébastien, Plouhinec Jean-Louis, Mazan Sylvie, Crozatier Michèle, Vincent Alain
Centre de Biologie du Développement, UMR 5547 and IFR 109 CNRS/UPS, 118 route de Narbonne 31062 Toulouse cedex 4, France.
BMC Evol Biol. 2008 May 2;8:131. doi: 10.1186/1471-2148-8-131.
The increasing number of available genomic sequences now makes it possible to study the evolutionary history of specific genes or gene families. Transcription factors (TFs) involved in the regulation of gene-specific expression are key players in the evolution of metazoan development. The low-complexity COE (Collier/Olfactory-1/Early B-Cell Factor) family of transcription factors is a well-suited paradigm for studying the evolution of TF structure and function, including the specific question of protein modularity. Here, we compare the structure of coe genes across metazoans and report on the mechanism behind a vertebrate-specific exon duplication.
COE proteins display a modular organisation, with three highly conserved domains: a COE-specific DNA-binding domain (DBD), an Immunoglobulin/Plexin/transcription (IPT) domain and an atypical Helix-Loop-Helix (HLH) motif. Comparison of the splice structure of coe genes between cnidarians and bilaterians shows that the ancestral COE DBD was built from 7 separate exons, with no evidence for exon shuffling with other metazoan gene families. It also confirms the presence of an ancestral H1LH2 motif in all COE proteins, which partly overlaps the repeated H2d-H2a motif first identified in rodent EBF. Electrophoretic Mobility Shift Assays show that formation of COE dimers is mediated by this ancestral motif. The H2d-H2a alpha-helical repetition appears to be a vertebrate characteristic that originated from a tandem exon duplication that took place prior to the split between gnathostomes and cyclostomes. We put forward a two-step model for the inclusion of this exon in the vertebrate transcripts.
Three main features in the history of the coe gene family can be inferred from these analyses: (i) Each conserved domain of the ancestral coe gene was built from multiple exons, and the same scattered structure has been maintained throughout metazoan evolution. (ii) There is a single coe gene copy per metazoan genome, except in vertebrates. The H2d-H2a duplication that is specific to vertebrate proteins provides an example of a novel vertebrate characteristic, which may have been fixed early in the gnathostome lineage. (iii) This duplication provides an interesting example of counter-selection of alternative splicing.