Matsushita Tatsuo, Kano-Sueoka Tamiko
1508 Fuqua Drive, Fort Collins, CO, 80521, USA.
Molecular Cellular and Developmental Biology, University of Colorado Boulder, Boulder, CO, 80309, USA.
J Mol Evol. 2023 Apr;91(2):169-191. doi: 10.1007/s00239-023-10093-5. Epub 2023 Feb 21.
The structure and function of human leucocyte antigen A (HLA-A), an extremely variable protein, are well known. From the public HLA-A database, we chose 26 high-frequency HLA-A alleles (45% of sequenced alleles). Using five arbitrarily chosen reference sequences from these alleles, we analyzed synonymous mutations at the third codon position (sSNP) and non-synonymous mutations (NSM). Both mutation types were non-randomly located, at 29 sSNP codons and 71 NSM codons across the five reference lists. Most sSNP codons show identical mutation types, with many mutations resulting from cytosine deamination. We proposed 23 ancestral parents of sSNP in the five reference sequences, using conserved parents in five unidirectional codons and 18 majority parents in reciprocal codons. These 23 proposed ancestral parents show exclusive codon usage of G or C parents, located on both DNA strands, that mutate to A or T variants mostly (76%) by cytosine deamination. The sSNP and NSM are clearly separated: most sSNP are located in conserved areas in exons 2, 3 and 4, whereas most NSM appear in two Variable Areas, which contain no sSNP, in the latter parts of exons 2 (α1) and 3 (α2). The Variable Areas contain NSM (polymorphic) residues at the center of the groove that binds the foreign peptide. We find that the mutation patterns in NSM codons are distinctly different from those of sSNP. Namely, the G-C to A-T mutation frequency was much smaller, suggesting that the evolutionary pressures of deamination and other mechanisms acting on the two areas are significantly different.
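The two classifications the abstract relies on (synonymous third-position change versus non-synonymous change, and whether a change is consistent with cytosine deamination) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `classify` and `is_deamination_like` are hypothetical, the standard genetic code is assumed, and the direction of each change (parent to variant) is assumed to be already known, whereas the paper infers it from conserved and majority parents.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(parent: str, variant: str) -> str:
    """Label a codon pair (hypothetical helper, labels chosen for illustration).

    'sSNP'             : same amino acid, only the third position differs
    'NSM'              : the encoded amino acid differs
    'other_synonymous' : same amino acid, but a change outside position 3
    'identical'        : no difference
    """
    parent, variant = parent.upper(), variant.upper()
    diffs = [i for i in range(3) if parent[i] != variant[i]]
    if not diffs:
        return "identical"
    if CODON_TABLE[parent] != CODON_TABLE[variant]:
        return "NSM"
    return "sSNP" if diffs == [2] else "other_synonymous"


def is_deamination_like(parent: str, variant: str) -> bool:
    """True for a single C>T change, or G>A (C>T read from the complementary strand)."""
    changes = [(p, v) for p, v in zip(parent.upper(), variant.upper()) if p != v]
    return len(changes) == 1 and changes[0] in {("C", "T"), ("G", "A")}
```

For example, `classify("GCC", "GCT")` labels an Ala-to-Ala third-position change as an sSNP, and `is_deamination_like("GCC", "GCT")` flags it as a C-to-T transition. Other G/C to A/T changes, such as C to A, count toward the G-C to A-T total but are not flagged as deamination-like here.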