Milicevic Ognjen, Repac Jelena, Bozic Bojan, Djordjevic Magdalena, Djordjevic Marko
School of Medicine, University of Belgrade, Belgrade, Serbia.
Multidisciplinary Ph.D. Program in Biophysics, University of Belgrade, Belgrade, Serbia.
Front Microbiol. 2019 Sep 4;10:2054. doi: 10.3389/fmicb.2019.02054. eCollection 2019.
Inferring the transcriptional direction (orientation) of a CRISPR array is essential for many applications, including the systematic investigation of non-canonical CRISPR/Cas functions. The standard method, CRISPRDirection (embedded within CRISPRCasFinder), fails to predict the orientation (returning non-determined, or ND, predictions) for ∼37% of the classified CRISPR arrays (>2200 loci); this rises to >70% for the II-B subtype, where non-canonical functions were first experimentally discovered. An alternative, Potential Orientation (also embedded within CRISPRCasFinder), returns ND predictions far less often but might have significantly lower accuracy. We propose a novel, simple criterion in which the CRISPR array direction is assigned according to the direction of its associated genes (Cas Orientation). We systematically assess the performance of the three methods (Cas Orientation, CRISPRDirection, and Potential Orientation) across all CRISPR/Cas subtypes, both by mutually cross-checking their predictions and by comparing them to an experimental dataset. Interestingly, CRISPRDirection agrees much better with Cas Orientation than with Potential Orientation, even though CRISPRDirection and Potential Orientation are related to each other (Potential Orientation corresponds to one of the six heterogeneous predictors employed by CRISPRDirection) and both are unrelated to Cas Orientation. We find that Cas Orientation has much higher accuracy than Potential Orientation and accuracy comparable to that of CRISPRDirection, while accurately assigning an orientation to ∼95% of the CRISPR arrays that CRISPRDirection leaves non-determined. At the same time, Cas Orientation is simple to employ: it requires only the direction of the associated protein-coding genes, whose prediction is routine for prokaryotes.
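As an illustration of the Cas Orientation criterion and of the mutual cross-check described above, the following is a minimal Python sketch, not the authors' implementation. It assumes that the strands of the associated cas genes are already available as "+" or "-" (for example, from a gene annotation). The abstract does not say how conflicting gene directions are resolved, so the majority vote with ties reported as "ND" is an assumption, as are the function names `cas_orientation` and `agreement`.

```python
from collections import Counter
from typing import Iterable, Optional, Sequence


def cas_orientation(cas_strands: Iterable[str]) -> str:
    """Assign an array orientation from the strands of its associated cas genes.

    Assumption (not stated in the abstract): conflicting strands are resolved
    by majority vote, and a tie or an empty input yields "ND".
    """
    counts = Counter(s for s in cas_strands if s in ("+", "-"))
    plus, minus = counts["+"], counts["-"]
    if plus == minus:  # covers both "no cas genes" and a tied vote
        return "ND"
    return "+" if plus > minus else "-"


def agreement(pred_a: Sequence[str], pred_b: Sequence[str]) -> Optional[float]:
    """Fraction of loci on which two methods agree.

    Only loci where both methods give a determined orientation are counted;
    returns None if there are no such loci.
    """
    pairs = [(a, b) for a, b in zip(pred_a, pred_b) if a != "ND" and b != "ND"]
    if not pairs:
        return None
    return sum(a == b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy loci, each a list of cas gene strands.
    loci = [["+", "+", "+"], ["-", "-"], ["+", "-"], []]
    cas_calls = [cas_orientation(strands) for strands in loci]
    other_calls = ["+", "+", "-", "ND"]  # hypothetical calls from another method
    print(cas_calls)
    print(agreement(cas_calls, other_calls))
```

Excluding ND calls from the agreement score keeps coverage (how often a method makes a call) separate from consistency (how often two calls match), which are the two quantities the abstract contrasts.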