Shapiro S G, Schon E A, Townes T M, Lingrel J B
J Mol Biol. 1983 Sep 5;169(1):31-52. doi: 10.1016/s0022-2836(83)80174-0.
Overlapping clones containing beta-globin genes have been isolated from a goat genomic library that establish the linkage arrangement 5'-epsilon I-epsilon II-psi beta X-beta C-3'. The complete nucleotide sequence of the epsilon I and epsilon II genes was determined. The sequences of these two genes, along with those previously reported for psi beta X and beta C, complete the sequence of the genes of this linkage set. The first gene in the quadruplet, epsilon I, shows unexpectedly high homology with the human epsilon globin gene both in coding and non-coding regions, and encodes a globin protein that is 90% homologous to human epsilon. The only major difference between the goat epsilon I gene and the human epsilon gene is the presence of an insertion element in the second intron of epsilon I. This element is repetitive in nature and is similar to those found in the second intron of the gamma, beta C and beta A globin genes of the goat. The epsilon II gene also shows high nucleotide homology to the human epsilon globin gene in coding regions and encodes a protein 79% homologous to human epsilon. Notably, however, epsilon II has equivalent nucleotide homology in coding regions to the gamma and epsilon genes of the human locus. The insertion element present in epsilon I is not present in epsilon II. A comparison of the goat beta globin set described here, based on linkage arrangement, nucleotide homology and divergence analysis, indicates that this subset of goat beta globin genes is analogous to the entire beta globin loci of other mammalian species. These analyses further indicate that the embryonic genes in these clusters are evolving more slowly than the adult beta globin genes.
Comparison of the 5' flanking sequences of epsilon I and epsilon II with those of the beta-embryonic globin genes of other mammals reveals a conserved sequence, C-A-C-C-C-C-T-G, located 28 to 29 bases upstream from the C-C-A-A-T consensus sequence. The conserved sequence appears at this position in the embryonic genes, but in none of the non-embryonic genes. Significantly, this sequence is selectively conserved in the human alpha embryonic globin gene, zeta, which diverged from the beta embryonic genes 500 million years ago, and it may therefore represent an embryonic recognition or signal sequence.
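The positional relationship described above (the C-A-C-C-C-C-T-G sequence lying 28 to 29 bases upstream from C-C-A-A-T) can be checked on any 5' flanking sequence with a short script. The following is only a minimal sketch, not the authors' method: the function name `motif_to_ccaat_gaps` is hypothetical, the input sequence is a synthetic toy string rather than real globin data, and the spacing is assumed to be counted as the number of bases between the end of the conserved sequence and the start of the nearest downstream C-C-A-A-T, a convention the abstract does not specify.

```python
import re

MOTIF = "CACCCCTG"  # conserved sequence reported in the abstract
CCAAT = "CCAAT"     # consensus sequence named in the abstract


def motif_to_ccaat_gaps(seq):
    """Return (motif_start, ccaat_start, gap) for each motif occurrence.

    gap is the number of bases between the end of the motif and the start
    of the nearest downstream CCAAT (an assumed counting convention).
    Positions are 0-based. Motifs with no downstream CCAAT are skipped.
    """
    seq = seq.upper()
    hits = []
    for match in re.finditer(MOTIF, seq):
        box_start = seq.find(CCAAT, match.end())
        if box_start != -1:
            hits.append((match.start(), box_start, box_start - match.end()))
    return hits


# Synthetic toy sequence (NOT real data): motif, 28 filler bases, then CCAAT.
toy = "GG" + MOTIF + "A" * 28 + CCAAT + "TT"
for motif_start, box_start, gap in motif_to_ccaat_gaps(toy):
    print(motif_start, box_start, gap)
```

Run on real flanking sequences, a gap of 28 or 29 under this convention would correspond to the arrangement described for the embryonic genes; a different counting convention (for example, start to start) would shift the numbers by a constant.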