Gong Q H, Cho J W, Huang T, Potter C, Gholami N, Basu N K, Kubota S, Carvalho S, Pennington M W, Owens I S, Popescu N C
Heritable Disorders Branch, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD 20892, USA.
Pharmacogenetics. 2001 Jun;11(4):357-68. doi: 10.1097/00008571-200106000-00011.
The novel UGT1 complex locus, previously shown to encode six different UDP-glucuronosyltransferase (transferase) genes, has been extended and shown to specify a total of 13 isoforms. The genes are designated UGT1A1 through UGT1A13p, and four of them are pseudogenes: UGT1A2p and UGT1A11p through UGT1A13p have either nucleotide deletions or flawed TATA boxes. In the 5' region of the locus, the 13 unique exons 1 are arranged in a tandem array, each with its own proximal TATA box element, and are in turn linked to four common exons. This arrangement allows independent transcriptional initiation and generates overlapping primary transcripts. In each of the nine viable primary transcripts, only the lead exon is predicted to be spliced to the four common exons, generating mRNAs with identical 3' ends and transferase isozymes with an identical carboxyl terminus. The unique amino terminus specifies acceptor-substrate selection, and the common carboxyl terminus apparently specifies the interaction with the common donor substrate, UDP-glucuronic acid. In the extended region, the viable TATA boxes are either A(A)TgA(AA)T or AT14AT; in the original locus, the element for UGT1A1 is A(TA)7A and that for all of the other genes is TAATT/CAA(A). UGT1A1 specifies the critically important bilirubin transferase isoform. The exons 1 are related to each other as follows: UGT1A2p through UGT1A5 form cluster A, whose members are 87-92% identical, and UGT1A7 through UGT1A13p form cluster B, whose members are 67-91% identical. Of the two exons not included in a cluster, UGT1A1 is closest to cluster A, at 60-63% identity, whereas UGT1A6 is 48-56% identical to all other unique exons. The locus was expanded from 95 kb to 218 kb. Extensive probing of clones beyond 218 kb with coding nucleotides for a highly conserved amino acid sequence present in all transferases did not detect any other exons 1. The mRNAs are differentially expressed in hepatic and extrahepatic tissues. This locus is indeed novel in that it uses a minimal number of exon sequences to specify different transferase isozymes with an expansive substrate range.
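The shared-exon arrangement described above can be illustrated with a minimal Python sketch. The gene names and the pseudogene set come from the abstract; the exon labels (`exon2` to `exon5`, `_exon1`) and the function names are hypothetical placeholders chosen for illustration, not identifiers from the paper.

```python
# Gene names and pseudogene status follow the abstract.
PSEUDOGENES = {"UGT1A2p", "UGT1A11p", "UGT1A12p", "UGT1A13p"}

# Hypothetical labels for the four common exons shared by every transcript.
COMMON_EXONS = ["exon2", "exon3", "exon4", "exon5"]


def locus_genes():
    """Return the 13 gene names UGT1A1 .. UGT1A13p in tandem order."""
    names = []
    for i in range(1, 14):
        name = f"UGT1A{i}"
        if name + "p" in PSEUDOGENES:
            name += "p"  # pseudogenes carry a 'p' suffix
        names.append(name)
    return names


def mature_transcript(gene):
    """Model splicing: one unique exon 1 joined to the four common exons.

    Returns None for pseudogenes, which yield no viable transcript.
    """
    if gene in PSEUDOGENES:
        return None
    return [f"{gene}_exon1"] + COMMON_EXONS


genes = locus_genes()
viable = [g for g in genes if mature_transcript(g) is not None]
print(len(genes), "genes,", len(viable), "viable")
```

In this toy model every viable transcript differs only in its first element, which mirrors the statement that the isozymes share an identical carboxyl terminus and differ in the amino terminus.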