Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, United States.
Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY, United States.
Front Immunol. 2021 May 10;12:671944. doi: 10.3389/fimmu.2021.671944. eCollection 2021.
Activation-induced deaminase (AID) is a key enzyme involved in antibody diversification by initiating somatic hypermutation (SHM) and class-switch recombination (CSR) of the immunoglobulin (Ig) loci. AID preferentially targets WRC (W=A/T, R=A/G) hotspot motifs and avoids SYC (S=C/G, Y=C/T) coldspots. G-quadruplex (G4) structures are four-stranded DNA secondary structures with key functions in transcription, translation and replication. Studies have shown G4s to form and bind AID in Ig switch (S) regions. Alterations in the gene encoding AID can further disrupt AID-G4 binding and reduce CSR. However, it is still unclear whether G4s form in the variable (V) region, or how they may affect SHM. To assess the possibility of G4 formation in human V regions, we analyzed germline human Ig heavy chain V (IGHV) sequences, using a pre-trained deep learning model that predicts G4 potential. This revealed that many genes from the IGHV3 and IGHV4 families are predicted to have high G4 potential on the top and bottom strands, respectively. Different IGHV alleles also showed variability in G4 potential. Using a high-resolution (G4-seq) dataset of biochemically confirmed potential G4s in IGHV genes, we validated our computational predictions. G4-seq also revealed variation between S and V regions in the distribution of potential G4s, with the V region having overall reduced G4 abundance compared to the S region. The density of AGCT motifs, where two AGC hotspots overlap on both strands, was roughly 2.6-fold greater in the V region than in the constant (C) region, which does not mutate despite having predicted G4s at similar levels. However, AGCT motifs in both V and C regions were less abundant than in S regions. Mutagenesis experiments showed that G4 potentials were generally robust to mutation, although large deviations from germline states were found, mostly in framework regions. G4 potential is also associated with higher mutability of certain WRC hotspots on the same strand. In addition, CCC coldspots opposite a predicted G4 were shown to be targeted significantly more for mutation. Our overall assessment reveals plausible evidence of functional G4s forming in the Ig V region.
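The motif definitions used above (WRC hotspots, SYC coldspots, and the palindromic AGCT motif that carries an AGC hotspot on each strand) can be illustrated with a short scanning routine. The following is only a minimal sketch under stated assumptions: the function names (`motif_positions`, `motif_density`, `reverse_complement`) and the example sequence are hypothetical, density is taken here as motif occurrences per nucleotide, and nothing in it reproduces the study's actual pipeline or its G4 predictions.

```python
import re

# IUPAC codes as defined in the abstract: W = A/T, R = A/G, S = C/G, Y = C/T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "R": "[AG]", "S": "[CG]", "Y": "[CT]",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the bottom-strand sequence, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def motif_positions(seq, motif):
    """Return 0-based start positions of an IUPAC motif in seq.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    pattern = "".join(IUPAC[base] for base in motif.upper())
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


def motif_density(seq, motif):
    """Occurrences per nucleotide (an assumed definition of density)."""
    return len(motif_positions(seq, motif)) / len(seq) if seq else 0.0


if __name__ == "__main__":
    example = "AGCTCCC"  # hypothetical toy sequence, not a real IGHV gene
    print("top-strand WRC:", motif_positions(example, "WRC"))
    print("top-strand SYC:", motif_positions(example, "SYC"))
    print("bottom-strand WRC:", motif_positions(reverse_complement(example), "WRC"))
    print("AGCT density:", motif_density(example, "AGCT"))
```

Because AGCT is its own reverse complement, every AGCT site found on the top strand is also an AGC hotspot on the bottom strand, which is why a single top-strand scan suffices for that motif.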