Ellsworth D L, Hewett-Emmett D, Li W H
Center for Demographic and Population Genetics, University of Texas Health Science Center at Houston 77225.
Mol Biol Evol. 1994 Nov;11(6):875-85. doi: 10.1093/oxfordjournals.molbev.a040170.
The genomes of homeothermic (warm-blooded) vertebrates are mosaics of interspersed, compositionally homogeneous GC-rich and GC-poor regions (isochores). The evolution of genome compartmentalization and of GC-rich isochores is hypothesized to reflect either selective advantages of an elevated GC content or chromosomal location and the mutational pressure associated with the timing of DNA replication in germ cells. To address the current controversy over the origin and maintenance of isochores in homeothermic vertebrates, we used newly obtained and published nucleotide sequences of the insulin and insulin-like growth factor (IGF) genes to examine the evolution of base composition in nonconstrained (flanking) and weakly constrained (introns and fourfold degenerate sites) regions. These genes belong to a well-characterized gene family believed to have evolved by repeated duplication and divergence. A phylogeny derived from amino acid sequences supports a common evolutionary history for the insulin/IGF family genes. In cold-blooded vertebrates, insulin and the IGFs were similar in base composition. In contrast, insulin and IGF-II showed dramatic increases in GC richness in mammals, whereas IGF-I showed no such trend. Across vertebrates, the base composition of the coding portions of the insulin and IGF genes correlated (r = 0.90) with that of the introns and flanking regions. In mammals, the GC content of homologous introns differed dramatically between the insulin/IGF-II genes and the IGF-I gene, but in each case was similar to the GC level of noncoding regions in neighboring genes. Our findings suggest that the base composition of introns and flanking regions is determined by chromosomal location and by the mutational pressure of the isochore in which the sequences are embedded. An elevated GC content at codon third positions in the insulin and IGF genes may reflect selective constraints on the usage of synonymous codons.
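The quantities compared in the abstract (GC content of coding versus noncoding regions, GC content at codon third positions, and a correlation coefficient r) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names `gc_content`, `gc3`, and `pearson_r` are hypothetical, the sketch counts every codon third position rather than only fourfold degenerate sites, and any sequences passed to it are the reader's own input, not data from the study.

```python
from math import sqrt


def gc_content(seq):
    """Fraction of G or C among unambiguous bases (A, C, G, T)."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("no unambiguous bases in sequence")
    return sum(b in "GC" for b in counted) / len(counted)


def gc3(cds):
    """GC content at codon third positions of an in-frame coding sequence.

    Simplification (assumption): all third positions are counted, not only
    fourfold degenerate sites as in the study.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return gc_content(cds[2:usable:3])


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

In use, one would compute `gc_content` for the coding region and for the introns and flanking regions of each gene, then pass the two resulting lists to `pearson_r` to obtain a coefficient comparable in kind to the r reported above.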