Laboratory of Computational Oncology, Department of Medicine, University of Hong Kong, Pokfulam, Hong Kong, Hong Kong.
PLoS One. 2007 Jul 11;2(7):e603. doi: 10.1371/journal.pone.0000603.
Promoter-associated CpG islands (PCIs) mediate methylation-dependent gene silencing, yet tend to co-localize with transcriptionally active genes. To address this paradox, we used data mining to assess the behavior of PCI-positive (PCI+) genes in the human genome.
PCI+ genes exhibit a bimodal distribution: (1) a 'housekeeping-like' subset characterized by higher GC content and lower intron length and number, and (2) a 'pseudogene paralog' subset characterized by lower GC content and higher intron length and number (p<0.001). These subsets are functionally distinguishable, with the former characterized by higher expression levels and a lower evolutionary rate (p<0.001). PCI-negative (PCI-) genes exhibit a higher evolutionary rate and narrower expression breadth than PCI+ genes (p<0.001), consistent with more frequent tissue-specific inactivation.
Adaptive evolution of the human genome appears to be driven in part by declining transcription of a subset of PCI+ genes, predisposing to both CpG→TpA mutation and intron insertion. We propose a model of evolving biological complexity in which environmentally selected gains or losses of PCI methylation respectively favor positive or negative selection, thus polarizing PCI+ gene structures around a genomic core of ancestral PCI- genes.
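The GC-content comparisons above rest on simple sequence statistics, which can be illustrated with a minimal sketch. The abstract does not state how PCIs were identified, so the thresholds below (length of at least 200 bp, GC content of at least 50%, observed/expected CpG ratio of at least 0.6) are an assumption taken from the commonly cited Gardiner-Garden and Frommer criteria, and the function names are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Assumed thresholds (not taken from the abstract)."""
    return (len(seq) >= min_len
            and gc_content(seq) >= min_gc
            and cpg_obs_exp(seq) >= min_ratio)
```

For a promoter region, one would typically apply `looks_like_cpg_island` to a window of sequence around the transcription start site; the window size is a further choice the abstract does not specify.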