Laudet V, Stehelin D, Clevers H
CNRS URA 1160, Institut Pasteur, Lille, France.
Nucleic Acids Res. 1993 May 25;21(10):2493-501. doi: 10.1093/nar/21.10.2493.
The HMG box is a novel type of DNA-binding domain found in a diverse group of proteins. The HMG box superfamily comprises, among others, the High Mobility Group proteins HMG1 and HMG2, the nucleolar transcription factor UBF, the lymphoid transcription factors TCF-1 and LEF-1, the fungal mating-type genes mat-Mc and MATA1, and the mammalian sex-determining gene SRY. The superfamily dates back at least 1,000 million years, as its members appear in animals, plants and yeast. Alignment of all known HMG boxes defined an unusually loose consensus sequence. We constructed phylogenetic trees connecting the members of the HMG box superfamily in order to understand their evolution. This analysis led us to distinguish two subfamilies: one comprising proteins with a single sequence-specific HMG box, the other encompassing relatively non-sequence-specific DNA-binding proteins with multiple HMG boxes. By studying the extent of diversification of the superfamily, we found that the rate of evolution differed greatly among the various groups of HMG-box-containing factors. Comparison of the evolution of the two boxes of ABF2 and of mtTF1 implied different diversification models for these two proteins. Finally, we provide a tree for the highly complex group of SRY-like ('Sox') genes, clustering at least 40 different loci that rapidly diverged in various animal lineages.