Department of Bionanoscience, Delft University of Technology, Delft, The Netherlands.
Kavli Institute of Nanoscience, Delft, The Netherlands.
Genome Biol. 2021 Sep 30;22(1):281. doi: 10.1186/s13059-021-02495-9.
The adaptive CRISPR-Cas immune system stores sequences from past invaders as spacers in CRISPR arrays and thereby provides direct evidence that links invaders to hosts. Mapping CRISPR spacers has revealed many aspects of CRISPR-Cas biology, including target requirements such as the protospacer adjacent motif (PAM). However, studies have so far been limited by a low number of mapped spacers in the database.
By using vast metagenomic sequence databases, we map approximately one-third of more than 200,000 unique CRISPR spacers from a variety of microbes and derive a catalog of more than two hundred unique PAM sequences associated with specific CRISPR-Cas subtypes. These PAMs are further used to correctly assign the orientation of CRISPR arrays, revealing conserved patterns between the last nucleotides of the CRISPR repeat and PAM. We could also deduce CRISPR-Cas subtype-specific preferences for targeting either template or coding strand of open reading frames. While some DNA-targeting systems (type I-E and type II systems) prefer the template strand and avoid mRNA, other DNA- and RNA-targeting systems (types I-A and I-B and type III systems) prefer the coding strand and mRNA. In addition, we find large-scale evidence that both CRISPR-Cas adaptation machinery and CRISPR arrays are shared between different CRISPR-Cas systems. This could lead to simultaneous DNA and RNA targeting of invaders, which may be effective at combating mobile genetic invaders.
This study has broad implications for our understanding of how CRISPR-Cas systems work in a wide range of organisms for which only the genome sequence is known.
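As a rough illustration of how a PAM can be read off from mapped spacers, the sketch below builds a per-position consensus from the sequences flanking each protospacer. It is not the authors' pipeline: the function name `consensus_pam`, the `min_fraction` threshold, and the example flanks are all hypothetical, and a real analysis would also need to handle array orientation and subtype assignment.

```python
from collections import Counter


def consensus_pam(flanks, min_fraction=0.8):
    """Return a consensus motif from equal-length flanking sequences.

    Hypothetical helper for illustration only. Positions where no single
    nucleotide reaches min_fraction of the sequences are reported as 'N'.
    """
    if not flanks:
        raise ValueError("need at least one flanking sequence")
    length = len(flanks[0])
    if any(len(f) != length for f in flanks):
        raise ValueError("all flanking sequences must have the same length")

    motif = []
    for pos in range(length):
        # Count the nucleotides observed at this position across all flanks.
        counts = Counter(f[pos].upper() for f in flanks)
        base, n = counts.most_common(1)[0]
        motif.append(base if n / len(flanks) >= min_fraction else "N")
    return "".join(motif)


# Made-up flanks: the first position varies, the last two are conserved.
example_flanks = ["AAG", "GAG", "TAG", "CAG"]
print(consensus_pam(example_flanks))
```

With these made-up flanks, the first position has no dominant nucleotide and is reported as `N`, while the two conserved positions are kept.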