Catasti P, Chen X, Moyzis R K, Bradbury E M, Gupta G
Theoretical Biology and Biophysics, Los Alamos National Laboratory, NM 87545, USA.
J Mol Biol. 1996 Dec 6;264(3):534-45. doi: 10.1006/jmbi.1996.0659.
The insulin minisatellite of the insulin-linked polymorphic region (ILPR), a tandem repeat of the 14 base-pair unit 5'-ACAGGGGTGTGGGG-3' 3'-TGTCCCCACACCCC-5', is located 363 base-pairs upstream of the human insulin gene. A locus for insulin-dependent diabetes mellitus (IDDM) has been mapped to the ILPR. The ILPR is polymorphic in length, and this length polymorphism is related both to the transcriptional activity of the insulin gene and to susceptibility to IDDM. Here, we attempt to decipher the role of the ILPR structure in length polymorphism and transcriptional regulation. We show by gel electrophoresis, circular dichroism (CD), and one- and two-dimensional nuclear magnetic resonance spectroscopy (1D/2D NMR) that the G-rich strand of the ILPR adopts an intramolecularly folded hairpin G-quartet structure. A detailed analysis of 1D/2D NMR data for d(G4TGTG4) and d(G4TGTG4ACAG4TGTG4) enables us to define the nature of the chain folding, the stacking interactions of the G-tetrads in the stem, and the interactions of the bases in the loops. d(G4TGTG4ACAG4TGTG4) is the smallest unit of the G-rich strand that can form the intramolecular hairpin G-quartet structure. In long ILPR sequences, several such hairpin G-quartet structures can be linked in space. Indeed, using an in vitro replication assay, we show that such multiple hairpin G-quartet structures are present in the G-rich strand of an ILPR of repeat length 6. This observation suggests that the formation of multiple hairpin G-quartets may explain slippage during replication and the observed length polymorphism. From our high-resolution structure, we identify a set of interactions that are critical for the structure and stability of the hairpin G-quartet. Single or double mutations in the ILPR that destabilize these interactions also lower the transcriptional activity of the insulin gene. Therefore, the hairpin G-quartet structure of the ILPR correlates directly with the transcriptional activity of the human insulin gene.
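The sequence relationships stated in the abstract can be checked with a short script. The following is only an illustrative sketch, not part of the study's methods: the helper names (`complement`, `expand`, `min_repeats_containing`) are hypothetical, and the only inputs are the repeat unit and the two oligonucleotide shorthands quoted above. It confirms that the two strands of the repeat are complementary, and that d(G4TGTG4ACAG4TGTG4) carries four G4 tracts and needs two tandem repeats of the G-rich strand to be spelled out in full.

```python
import re

# G-rich strand of one ILPR repeat, written 5'->3' (taken from the abstract).
REPEAT = "ACAGGGGTGTGGGG"

# Watson-Crick base pairing table for DNA.
_PAIRS = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement (read 3'->5' against the input)."""
    return seq.translate(_PAIRS)


def expand(shorthand: str) -> str:
    """Expand run-length shorthand such as 'G4TGTG4' into 'GGGGTGTGGGG'."""
    return re.sub(r"([ACGT])(\d+)",
                  lambda m: m.group(1) * int(m.group(2)),
                  shorthand)


def min_repeats_containing(motif: str, max_repeats: int = 10) -> int:
    """Smallest number of tandem repeats whose G-rich strand contains motif."""
    for n in range(1, max_repeats + 1):
        if motif in REPEAT * n:
            return n
    raise ValueError("motif not found in the tandem repeat array")


short_oligo = expand("G4TGTG4")            # d(G4TGTG4)
long_oligo = expand("G4TGTG4ACAG4TGTG4")   # d(G4TGTG4ACAG4TGTG4)

print(complement(REPEAT))                  # C-rich strand, read 3'->5'
print(len(re.findall("GGGG", long_oligo)), "G4 tracts in the long oligo")
print(min_repeats_containing(short_oligo), "repeat(s) span the short oligo")
print(min_repeats_containing(long_oligo), "repeat(s) span the long oligo")
```

Run-length expansion is used here only so that the shorthand from the abstract can be compared directly with the full repeat sequence; the script makes no claim about folding or stability.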