Mak Chi H, Pham Phuong, Goodman Myron F
J Phys Chem A. 2019 Apr 4;123(13):3030-3037. doi: 10.1021/acs.jpca.9b00910. Epub 2019 Mar 21.
Activation-induced deoxycytidine deaminase (AID) is a key enzyme in the human immune system. AID binds to the immunoglobulin (Ig) gene and catalyzes random point mutations on it: the enzyme moves along the DNA by random walk motions, scanning for cytidines and turning them into uracils, which diversifies the Ig gene sequence. The mutation patterns deposited by AID on its substrate DNA sequences can be interpreted as random binary words, and the information content of this stochastically generated library of mutated DNA sequences can be measured by its entropy. In this paper, we derive an analytical formula for this entropy and show that the stochastic scanning and catalytic dynamics of AID are controlled by a characteristic length that depends on the diffusion coefficient of AID and the catalytic rate. Experiments have shown that the deamination rates depend on sequence context: mutations are generated at higher intensities on DNA sequences with higher densities of mutable sites. We derive an isomorphism between this classical system and a quantum mechanical model and use this isomorphism to explain why AID appears to focus its scanning on regions with higher concentrations of deaminable sites. Using path integral Monte Carlo simulations of the quantum isomorphic system, we demonstrate that AID's scanning indeed depends on the context of the DNA sequence and show how this affects the entropy of the library of generated mutant clones. Examining detailed features in the entropy of the experimentally generated clone library, we provide clear evidence that the random walk of AID on its substrate DNA is focused near hot spots. Model calculations applied to the experimental data show that the per-site mutation frequencies display contextual dependences similar to those observed in the experiments, in which hot motifs are located adjacent to several different types of hot and cold motifs.
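As a rough illustration of what "entropy of the clone library" means, the sketch below computes the empirical Shannon entropy of a set of binary mutation words. It is only a plug-in estimate from observed pattern frequencies, not the analytical formula derived in the paper. The function name library_entropy, the encoding (1 for a deaminated site, 0 for an unmutated one), and the four example clones are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def library_entropy(clones):
    """Empirical Shannon entropy, in bits, of a library of mutation patterns.

    Each clone is a binary word (here a string of '0'/'1'); identical words
    are pooled and their relative frequencies are used as probabilities.
    This is a plug-in estimator assumed for illustration only.
    """
    counts = Counter(clones)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Hypothetical library of four clones over four deaminable sites:
# 1 = C-to-U mutation at that site, 0 = site left unmutated.
clones = ["0000", "0100", "0100", "0110"]
print(library_entropy(clones))  # 1.5
```

A library in which every clone carries the same pattern has zero entropy, while a library spread evenly over many distinct patterns has high entropy, which is why the quantity is a natural measure of how much diversity the scanning process generates.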