Zhang Ren, Zhang Chun-Ting
Department of Epidemiology and Biostatistics, Tianjin Cancer Institute and Hospital, Tianjin 300060, China.
Bioinformatics. 2004 Mar 22;20(5):612-22. doi: 10.1093/bioinformatics/btg453. Epub 2004 Jan 22.
Some genomic islands contain horizontally transferred genes, which play critical roles in altering the genotypes and phenotypes of organisms, and horizontal gene transfer has been recognized as a universal event throughout bacterial evolution. The cumulative GC profile, a windowless method for displaying the distribution of GC content along a genome, is proposed for identifying genomic islands in completely sequenced genomes. Two new indices are also proposed to assess codon usage bias and amino acid usage bias in genomic islands.
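To make the idea of a windowless GC display concrete, the following is a minimal sketch, not the authors' program. It assumes one simple construction: accumulate +1 for each G or C and -1 for each A or T, then subtract the genome-wide average slope so that the curve starts and ends at zero. The function name `cumulative_gc_profile` and the choice of detrending are assumptions made for illustration; the abstract does not give the exact formula.

```python
def cumulative_gc_profile(seq):
    """Return a detrended cumulative GC curve with len(seq) + 1 points.

    Assumption for illustration: each G/C contributes +1, each A/T
    contributes -1, other symbols contribute 0, and the genome-wide
    average step is subtracted so the curve ends at zero.
    """
    seq = seq.upper()
    steps = [1 if base in "GC" else -1 if base in "AT" else 0 for base in seq]
    n = len(steps)
    if n == 0:
        return [0.0]
    slope = sum(steps) / n  # average step over the whole sequence
    profile = [0.0]
    running = 0
    for i, step in enumerate(steps, start=1):
        running += step
        # Subtract the linear trend so only local deviations remain.
        profile.append(running - slope * i)
    return profile


if __name__ == "__main__":
    toy = "ATAT" + "GCGC" + "ATAT"  # toy sequence with a GC-rich middle
    curve = cumulative_gc_profile(toy)
    print([round(v, 2) for v in curve])
```

In this construction, a region whose GC content is higher than the genome average appears as a rising segment and a region with lower GC content as a falling segment, so no sliding window size has to be chosen.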
A 211 kb genomic island (CGGI-1) has been identified in the genome of Corynebacterium glutamicum, and three genomic islands, VVGI-1, VVGI-2 and VVGI-3, with lengths of 167, 40 and 33 kb, respectively, have been identified in chromosome I of Vibrio vulnificus CMCP6. CGGI-1 is flanked by two direct repeats of approximately 500 bp and uses a Val-tRNA gene as its integration site. VVGI-1 and VVGI-2 each have an integrase gene at the 5' junction. All the identified genomic islands show unusual GC content, codon usage and amino acid usage compared with the rest of their genomes. In addition, the genomic islands are found to be fairly homogeneous in terms of GC content variation. An index, h, is proposed to quantify the homogeneity of GC content in genomic islands, and h is shown to be less than 0.1 for all the genomic islands analyzed. The cumulative GC profile, together with the indices for assessing codon usage bias, amino acid usage bias and homogeneity of genomic islands, will be useful in the analysis of other genomes.
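The comparison of GC content between an island and the rest of its genome can be sketched as follows. This is only an illustration under stated assumptions: the helper names `gc_fraction` and `island_vs_rest` are hypothetical, the coordinates are 0-based and half-open, and the toy sequence is not real data. It does not reproduce the codon usage, amino acid usage or h indices, whose definitions are not given in the abstract.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def island_vs_rest(genome, start, end):
    """Return (GC fraction of genome[start:end], GC fraction of the remainder).

    Assumption: start and end are 0-based, half-open coordinates of a
    candidate island supplied by the caller.
    """
    island = genome[start:end]
    rest = genome[:start] + genome[end:]
    return gc_fraction(island), gc_fraction(rest)


if __name__ == "__main__":
    toy_genome = "AT" * 10 + "GC" * 5 + "AT" * 10  # toy data only
    island_gc, rest_gc = island_vs_rest(toy_genome, 20, 30)
    print(island_gc, rest_gc)
```

A large difference between the two returned values is the kind of signal that the abstract describes as unusual GC content; on real genomes the candidate boundaries would come from inspecting the cumulative GC profile.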
Programs used in this work and numerical results are available upon request.