Majewski Jacek, Ott Jurg
Rockefeller University, New York, New York 10021, USA.
Genome Res. 2002 Dec;12(12):1827-36. doi: 10.1101/gr.606402.
The regulation of transcription and subsequent gene splicing are crucial to correct gene expression. Although a number of regulatory sequences involved in both processes are known, it is not clear how general their functions are in the genomic context, nor how the regulatory regions are distributed throughout the genome. Here we study the distribution of known mutagenic elements within human introns and exons to deduce the properties of regions essential for splicing and transcription. We show that intronic splicing regulators are generally found close to the splice sites, but may be found as far as 200 nucleotides away from the splice junctions. Similarly, sequences important for splicing may be located as far as 125 nucleotides away from the junctions, within exons. We characterize several types of simple repetitive sequences and low-complexity regions that are overrepresented close to both intron ends and are likely to play important roles in the splicing process. We show that the first introns within most genes play a particularly important regulatory role that is most likely, however, to be involved in transcription control. We also study the distribution of two known regulatory motifs, the GGG trinucleotide and the CpG dinucleotide, and deduce their respective importance to splicing and transcription regulation.
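The kind of positional analysis the abstract describes, tallying how often a motif such as GGG or CpG occurs at a given distance from a splice junction, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names `motif_position_counts` and `bin_counts`, the 200-nucleotide default window (borrowed from the distance quoted in the abstract), the bin size, and the toy sequences are all assumptions made for illustration. Each input string is assumed to be an intron sequence starting at its 5' splice junction.

```python
from collections import Counter


def motif_position_counts(introns, motif, window=200):
    """Count motif occurrences at each offset from the intron 5' end.

    Hypothetical helper: `introns` is an iterable of DNA strings, each
    assumed to begin at the 5' splice junction. Only matches lying fully
    within the first `window` nucleotides are counted, and overlapping
    matches are counted separately.
    """
    motif = motif.upper()
    counts = Counter()
    for seq in introns:
        seq = seq.upper()
        limit = min(window, len(seq))
        start = seq.find(motif)
        while start != -1 and start + len(motif) <= limit:
            counts[start] += 1
            # Advance by one so overlapping matches (e.g. GGGG) are found.
            start = seq.find(motif, start + 1)
    return counts


def bin_counts(counts, bin_size=25, window=200):
    """Sum per-offset counts into consecutive bins of `bin_size` nucleotides."""
    bins = [0] * (window // bin_size)
    for offset, n in counts.items():
        idx = offset // bin_size
        if idx < len(bins):
            bins[idx] += n
    return bins


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy_introns = ["GTAAGGGCTGGGG", "GTGGGACG"]
    ggg = motif_position_counts(toy_introns, "GGG")
    cpg = motif_position_counts(toy_introns, "CG")
    print(sorted(ggg.items()))
    print(sorted(cpg.items()))
    print(bin_counts(ggg, bin_size=5, window=10))
```

The bins hold raw counts. Comparing regions in a real analysis would also require normalizing by the number of sequences that actually reach each bin and by background base composition, neither of which this sketch attempts.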