Department of Energy, Joint Genome Institute, Berkeley, CA, USA.
Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
RNA Biol. 2022;19(1):678-685. doi: 10.1080/15476286.2022.2067714. Epub 2021 Dec 31.
Noncoding RNAs with secondary structures play important roles in CRISPR-Cas systems. Many of these structures likely remain undiscovered. We used a large-scale comparative genomics approach to predict 156 novel candidate structured RNAs from 36,111 CRISPR-Cas systems. A number of these overlapped with coding genes, including palindromic candidates that overlapped with a variety of Cas genes in type I and III systems. Among these 156 candidates, we identified 46 new models of CRISPR direct repeats and 1 tracrRNA. This tracrRNA model occasionally overlapped with predicted coding regions, emphasizing the importance of extending search windows for novel structured RNAs into coding regions. We also demonstrated that the antirepeat sequence in this tracrRNA model can be used to accurately assign thousands of predicted CRISPR arrays to type II-C systems. This study highlights the importance of unbiased identification of candidate structured RNAs across CRISPR-Cas systems.
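The idea of using an antirepeat to assign CRISPR arrays can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes only that a tracrRNA antirepeat is roughly complementary to the CRISPR repeat it pairs with, and it scores each repeat by ungapped identity between the antirepeat and the repeat's reverse complement. The function names, the toy sequences, and the 0.8 threshold are all hypothetical.

```python
# Toy illustration only: names, sequences, and the threshold are hypothetical
# and are not taken from the study.

# Watson-Crick complement table for RNA (A<->U, C<->G).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return rna.translate(COMPLEMENT)[::-1]


def best_ungapped_identity(antirepeat: str, repeat: str) -> float:
    """Slide the antirepeat along the reverse complement of the repeat and
    return the highest fraction of identical positions (no gaps allowed)."""
    target = reverse_complement(repeat)
    n = len(antirepeat)
    if n == 0 or n > len(target):
        return 0.0
    best = 0
    for start in range(len(target) - n + 1):
        window = target[start:start + n]
        matches = sum(a == b for a, b in zip(antirepeat, window))
        best = max(best, matches)
    return best / n


def assign_arrays(antirepeat: str, repeats_by_array: dict, min_identity: float = 0.8) -> list:
    """Return the names of arrays whose repeat pairs with the antirepeat
    at or above the (assumed) identity threshold."""
    return [
        name
        for name, repeat in repeats_by_array.items()
        if best_ungapped_identity(antirepeat, repeat) >= min_identity
    ]


if __name__ == "__main__":
    toy_antirepeat = "GGGAGCUACAAC"
    toy_arrays = {
        "array_1": "GUUGUAGCUCCC",  # complementary to the toy antirepeat
        "array_2": "AAAAAAAAAAAA",  # unrelated repeat
    }
    print(assign_arrays(toy_antirepeat, toy_arrays))
```

A real analysis would more plausibly use a covariance-model or alignment tool that tolerates gaps and G-U wobble pairs. The ungapped scan here is only meant to make the complementarity criterion concrete.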