Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, United States.
Karches Center for Oncology Research, The Feinstein Institute for Medical Research, Northwell Health, Manhasset, NY, United States.
Front Immunol. 2020 Apr 30;11:788. doi: 10.3389/fimmu.2020.00788. eCollection 2020.
Somatic hypermutation (SHM) of the immunoglobulin variable (IgV) loci is a key process in antibody affinity maturation. The enzyme activation-induced deaminase (AID) initiates SHM by creating C → U mismatches on single-stranded DNA (ssDNA). AID preferentially targets hotspot motifs in the context of WRC/GYW (W = A/T, R = A/G, Y = C/T), particularly WGCW overlapping hotspots, where hotspots appear opposite each other on both strands. Subsequent recruitment of the low-fidelity DNA repair enzyme Polymerase eta (Polη) during mismatch repair creates additional mutations at WA/TW sites. Although there are more than 50 functional immunoglobulin heavy chain variable (IGHV) segments in humans, the fundamental differences between these genes and their ability to respond to all possible foreign antigens are still poorly understood. To better understand this, we generated profiles of WGCW hotspots in each of the human IGHV genes and found the expected high frequency in the complementarity determining regions (CDRs) that encode the antigen binding sites, but also an unexpectedly high frequency of WGCW in certain framework (FW) sub-regions. Principal Components Analysis (PCA) of these overlapping AID hotspot profiles revealed that one major difference between IGHV families is the presence or absence of WGCW in a sub-region of FW3 sometimes referred to as "CDR4." Further differences between members of each family (e.g., IGHV1) are primarily determined by their WGCW densities in CDR1. We previously suggested that the co-localization of AID overlapping hotspots and Polη hotspots was associated with high mutability of certain IGHV sub-regions, such as the CDRs. To evaluate the importance of this feature, we extended the WGCW profiles by combining them with local densities of Polη (WA) hotspots, thus describing the co-localization of both types of hotspots across all IGHV genes. We also verified that co-localization is associated with higher mutability.
PCA of the co-localization profiles showed CDR1 and CDR2 as being the main contributors to variance among IGHV genes, consistent with the importance of these sub-regions in antigen binding. Our results suggest that AID overlapping (WGCW) hotspots alone or in conjunction with Polη (WA/TW) hotspots are key features of evolutionary variation between IGHV genes.
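The hotspot profiles described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the window size, and the choice to combine the two densities by multiplication are all assumptions made for illustration. Only the motif definitions (WGCW, WA/TW, with W = A/T) come from the abstract.

```python
import re

# Motifs as defined in the abstract (W = A or T).
# Lookaheads are used so that overlapping occurrences are all reported.
WGCW = re.compile(r"(?=[AT]GC[AT])")        # AID overlapping hotspot
POL_ETA = re.compile(r"(?=[AT]A|T[AT])")    # Polη hotspot: WA or its complement TW


def hotspot_positions(seq, pattern):
    """Return 0-based start positions of every (possibly overlapping) match."""
    return [m.start() for m in pattern.finditer(seq.upper())]


def window_counts(seq, pattern, window=9):
    """Count motif start positions inside each sliding window.

    The window size is a hypothetical parameter, not a value from the paper.
    """
    starts = set(hotspot_positions(seq, pattern))
    return [
        sum(1 for p in range(i, i + window) if p in starts)
        for i in range(len(seq) - window + 1)
    ]


def colocalization_profile(seq, window=9):
    """Combine local WGCW and Polη densities position by position.

    ASSUMPTION: the product is used here only as one simple way to score
    windows that contain both hotspot types; the paper's formula may differ.
    """
    wgcw = window_counts(seq, WGCW, window)
    pol = window_counts(seq, POL_ETA, window)
    return [a * b for a, b in zip(wgcw, pol)]


if __name__ == "__main__":
    toy = "AGCTTTAGCA"  # made-up sequence, not a real IGHV segment
    print(hotspot_positions(toy, WGCW))
    print(hotspot_positions(toy, POL_ETA))
    print(colocalization_profile(toy, window=4))
```

Stacking such profiles for aligned IGHV sequences into a gene-by-position matrix would give the kind of input on which a PCA could be run, although the alignment and normalization steps are not specified in the abstract.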