Hardison R, Krane D, Vandenbergh D, Cheng J F, Mansberger J, Taddie J, Schwartz S, Huang X Q, Miller W
Department of Molecular and Cell Biology, Pennsylvania State University, University Park 16802.
J Mol Biol. 1991 Nov 20;222(2):233-49. doi: 10.1016/0022-2836(91)90209-o.
A sequence of 10,621 base-pairs from the alpha-like globin gene cluster of rabbit has been determined. It includes the sequence of gene zeta 1 (a pseudogene for the rabbit embryonic zeta-globin), the functional rabbit alpha-globin gene, and the theta 1 pseudogene, along with the sequences of eight C repeats (short interspersed repeats in rabbit) and a J sequence implicated in recombination. The region is quite G + C-rich (62%) and contains two CpG islands. As expected for a very G + C-rich region, it has an abundance of open reading frames, but few of the long open reading frames are associated with the coding regions of genes. Alignments between the sequences of the rabbit and human alpha-like globin gene clusters reveal matches primarily in the immediate vicinity of genes and CpG islands, while the intergenic regions of these gene clusters have many fewer matches than are seen between the beta-like globin gene clusters of these two species. Furthermore, the non-coding sequences in this portion of the rabbit alpha-like globin gene cluster are shorter than in human, indicating a strong tendency either for sequence contraction in the rabbit gene cluster or for expansion in the human gene cluster. Thus, the intergenic regions of the alpha-like globin gene clusters have evolved in a relatively fast mode since the mammalian radiation, but not exclusively by nucleotide substitution. Despite this rapid mode of evolution, some strong matches are found 5' to the start sites of the human and rabbit alpha genes, perhaps indicating conservation of a regulatory element. The rabbit J sequence is over 1000 base-pairs long; it contains a C repeat at its 5' end and an internal region of homology to the 3'-untranslated region of the alpha-globin gene. Part of the rabbit J sequence matches sequences within the X homology block in human. Both of these regions have been implicated as hot-spots for recombination; hence the matching sequences are good candidates for such a function. All the interspersed repeats within both gene clusters are retroposon SINEs that appear to have inserted independently in the rabbit and human lineages.
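The link drawn above between G + C richness and an abundance of open reading frames can be illustrated with a short calculation. The sketch below is not the authors' method. It assumes a simple model in which bases occur independently, with A = T and G = C, so that the three stop codons (TAA, TAG, TGA), being A + T-rich, become rarer as G + C content rises. The function names are hypothetical, and the helpers for G + C fraction and the CpG observed/expected ratio are included only to show how such quantities are commonly computed from a sequence string.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def stop_codon_probability(gc: float) -> float:
    """Chance that a random codon is TAA, TAG or TGA.

    Assumption (not from the source): independent bases with
    P(A) = P(T) = (1 - gc) / 2 and P(G) = P(C) = gc / 2.
    """
    at_base = (1.0 - gc) / 2.0
    gc_base = gc / 2.0
    # TAA contributes at_base**3; TAG and TGA each contribute at_base**2 * gc_base.
    return at_base ** 3 + 2.0 * at_base ** 2 * gc_base


if __name__ == "__main__":
    for gc in (0.50, 0.62):
        p_stop = stop_codon_probability(gc)
        # Under this model, the mean spacing between stops in one frame is 1 / p_stop codons.
        print(f"G+C = {gc:.2f}: P(stop) = {p_stop:.4f}, mean spacing = {1 / p_stop:.1f} codons")
```

Under these assumptions, a region with 62% G + C has a lower per-codon stop probability than one with 50% G + C, so stop-free stretches are longer on average in every reading frame, whether or not they encode protein.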