Vinogradov Alexander E
Institute of Cytology, Russian Academy of Sciences, Tikhoretsky Avenue 4, St Petersburg 194064, Russia.
Nucleic Acids Res. 2003 Sep 1;31(17):5212-20. doi: 10.1093/nar/gkg699.
The housekeeping (ubiquitously expressed) genes in the mammalian genome were shown here to be on average slightly GC-richer than tissue-specific genes. Both housekeeping and tissue-specific genes occupy similar ranges of GC content, but the former tend to concentrate in the upper part of the range. In the human genome, tissue-specific genes show two maxima, GC-poor and GC-rich. The strictly tissue-specific human genes tend to concentrate in the GC-poor region; their distribution is left-skewed and thus reciprocal to the distribution of housekeeping genes. The intermediately tissue-specific genes show an intermediate GC content and a right-skewed distribution. In both human and mouse, genes specific for some tissues (e.g., parts of the central nervous system) have a higher average GC content than housekeeping genes. Since these tissue-specific genes are not transcribed in the germ line (in contrast to housekeeping genes), and therefore have a lower probability of inheritable gene conversion, this finding contradicts the biased gene conversion (BGC) explanation for elevated GC content in the heavy isochores of the mammalian genome. Genes specific for germ-line tissues (ovary, testes) show a low average GC content, which is also in contradiction to the BGC explanation. Both for the total data set and for most tissues taken separately, a weak positive correlation was found between gene GC content and expression level. The fraction of ubiquitously expressed genes is nearly 1.5-fold higher in the mouse than in the human. This suggests that mouse tissues are comparatively less differentiated (at the molecular level), which can be related to a less pronounced isochoric structure of the mouse genome. In each separate tissue (in both species), tissue-specific genes do not form a clear-cut frequency peak (in contrast to housekeeping genes), but constitute a continuum with a gradually increasing degree of tissue-specificity, which probably reflects the path of cell differentiation and/or an independent use of the same protein in several unrelated tissues.
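The quantities the abstract refers to (per-gene GC content, the skew of a GC distribution, and the correlation between GC content and expression level) can be illustrated with a minimal sketch. This is not the author's pipeline: the function names `gc_content`, `skewness`, and `pearson_r` are hypothetical, and the moment-based skewness and Pearson coefficient are assumed here as common choices, since the abstract does not state which estimators were used. Usage of "left-skewed" and "right-skewed" also varies between authors; the function below reports the standard moment-based sign, in which a positive value means a longer right tail.

```python
from math import sqrt
from statistics import mean, pstdev


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


def skewness(values):
    """Population moment-based skewness (positive: longer right tail)."""
    values = list(values)
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("skewness is undefined for constant data")
    return mean([((v - m) / s) ** 3 for v in values])


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    xs, ys = list(xs), list(ys)
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (sx * sy)
```

In practice, `gc_content` would be applied to each gene sequence in a group (housekeeping or tissue-specific), `skewness` to the resulting list of values, and `pearson_r` to paired GC content and expression values for the same genes.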