MOE Key Laboratory for Industrial Biocatalysis, Institute of Biochemical Engineering, Department of Chemical Engineering, Tsinghua University, Beijing 100084, China.
Tsinghua-Peking Center for Life Sciences, School of Medicine, Tsinghua University, Beijing 100084, China.
Nucleic Acids Res. 2021 Feb 22;49(3):1263-1277. doi: 10.1093/nar/gkaa1295.
As an effective programmable DNA targeting tool, the CRISPR-Cas9 system has been adopted in a variety of biotechnological applications. However, off-target effects, which arise from tolerance of guide-target mismatches, are regarded as a major problem in engineering CRISPR systems. To understand this, we constructed two sgRNA libraries carrying saturated single- and double-nucleotide mismatches in living bacterial cells, and comprehensively profiled the in vivo binding affinity of dCas9 for its DNA target when guided by each individual sgRNA carrying particular mismatches. We observed a synergistic effect in the seed region, where combinatorial double mutations caused more severe activity loss than the two corresponding single mutations. Moreover, we found that a particular mismatch type, dDrG (D = A, T, G), impaired binding only moderately. To quantitatively understand the causal relationship between mismatches and the binding behaviour of dCas9, we further established a biophysical model, and found that the thermodynamic properties of base pairing, coupled with the strand invasion process, can largely account for the observed mismatch-activity landscape. Finally, we repurposed this model, together with a convolutional neural network constructed on the same mechanism, as a predictive tool to guide the rational design of sgRNAs in bacterial CRISPR interference.
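To make the size of a "saturated" single- and double-mismatch library concrete, the following is a minimal sketch that enumerates every such variant of one guide. It is illustrative only and not the authors' library design pipeline: the 20-nt spacer length, the example sequence, the DNA alphabet (T in place of U), and the function names `single_mismatches` and `double_mismatches` are all assumptions introduced here.

```python
from itertools import combinations, product

BASES = "ACGT"  # DNA alphabet for the spacer; an assumption for illustration


def single_mismatches(guide):
    """Yield every variant that differs from `guide` at exactly one position."""
    for pos, ref in enumerate(guide):
        for alt in BASES:
            if alt != ref:
                yield guide[:pos] + alt + guide[pos + 1:]


def double_mismatches(guide):
    """Yield every variant that differs from `guide` at exactly two positions."""
    for i, j in combinations(range(len(guide)), 2):
        alts_i = [b for b in BASES if b != guide[i]]
        alts_j = [b for b in BASES if b != guide[j]]
        for a, b in product(alts_i, alts_j):
            variant = list(guide)
            variant[i], variant[j] = a, b
            yield "".join(variant)


# Hypothetical 20-nt spacer, not taken from the paper.
guide = "GACGTTAGCCATGCAAGTCT"
singles = list(single_mismatches(guide))
doubles = list(double_mismatches(guide))
print(len(singles), len(doubles))
```

Under these assumptions, a guide of length n has 3n single-mismatch variants and 9·n(n−1)/2 double-mismatch variants, so the double-mismatch library grows quadratically with guide length.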