Li Can, Wu Qiang
Department of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA.
BMC Evol Biol. 2007 May 2;7:69. doi: 10.1186/1471-2148-7-69.
The human genome contains a large number of gene clusters with multiple variable first exons, including the drug-metabolizing UDP glucuronosyltransferase (UGT1) and I-branching beta-1,6-N-acetylglucosaminyltransferase (GCNT2, also known as IGNT) clusters. These clusters are organized in tandem arrays similar to those of the protocadherin (PCDH), immunoglobulin (IG), and T-cell receptor (TCR) clusters. To gain insight into the evolutionary processes that may have shaped their diversity, we performed comprehensive comparative analyses of vertebrate multiple-variable-first-exon clusters.
We found species-specific variable-exon duplications and mutations in the vertebrate Ugt1, Gcnt2, and Ugt2a clusters, and found that their variable and constant genomic organizations are conserved and vertebrate-specific. In addition, analysis of the complete repertoires of closely related Ugt2 clusters in humans, mice, and rats revealed extensive lineage-specific duplications. In contrast to the Pcdh gene clusters, gene conversion does not play a predominant role in the evolution of the vertebrate Ugt1, Gcnt2, and Ugt2 gene clusters. Thus, their tremendous diversity is achieved through "birth-and-death" evolution. Comparative analyses and homology modeling demonstrated that vertebrate UGT proteins have similar three-dimensional structures, each with N-terminal and C-terminal Rossmann-fold domains that bind acceptor and donor substrates, respectively. Molecular docking experiments identified key residues in donor and acceptor recognition and provided insight into the catalytic mechanism of UGT glucuronidation, suggesting that the human UGT1A1 residue histidine 39 (H39) acts as a general base and that aspartic acid 151 (D151) acts as an important electron-transfer helper. In addition, we identified four hypervariable regions in the N-terminal Rossmann domain that form an acceptor-binding pocket. Finally, analysis of the patterns of nonsynonymous and synonymous nucleotide substitutions identified codon sites that are subject to positive Darwinian selection at the molecular level. These diversified residues likely play an important role in the recognition of myriad xenobiotics and endobiotics.
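The distinction between nonsynonymous and synonymous substitutions mentioned above can be illustrated with a minimal Python sketch. It is not the method used in this study, which the abstract does not specify. It only labels each codon position in two aligned coding sequences as identical, synonymous (same amino acid), or nonsynonymous (different amino acid) under the standard genetic code. The function name `classify_codon_differences` and the toy sequences are hypothetical and chosen for illustration only; real tests for positive selection normalize by the number of synonymous and nonsynonymous sites and typically rely on likelihood models across a phylogeny.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_differences(seq_a, seq_b):
    """Label each aligned codon pair (hypothetical helper, illustration only).

    Both inputs must be gap-free, in-frame DNA strings of equal length.
    Returns one label per codon: "identical", "synonymous", or "nonsynonymous".
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    labels = []
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            labels.append("identical")
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            labels.append("synonymous")      # nucleotide change, same amino acid
        else:
            labels.append("nonsynonymous")   # nucleotide change alters the amino acid
    return labels


# Toy sequences (made up, not taken from any UGT gene):
# CAT -> CAC keeps His; GAT -> GAA changes Asp to Glu; AAA is unchanged.
print(classify_codon_differences("CATGATAAA", "CACGAAAAA"))
```

An excess of nonsynonymous over synonymous changes at a site, after proper normalization, is the kind of signal that site-wise selection analyses look for; this raw labeling is only the first ingredient.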
Our results suggest that the enormous diversity of vertebrate multiple variable first exons is achieved through birth-and-death evolution and that adaptive evolution of specific codon sites enhances vertebrate UGT diversity for defense against environmental agents. Our results also have implications for understanding the extensive molecular diversity required for chemical detoxification and drug clearance.