CAS Key Laboratory of Tropical Plant Resources and Sustainable Use, Yunnan Key Laboratory of Crop Wild Relatives Omics, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Kunming 650223, China.
College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae066.
CRISPR/Cas9 is a promising RNA-guided genome editing technology that consists of a Cas9 nuclease and a single-guide RNA (sgRNA). Many sgRNA prediction tools have been developed so far. However, they were usually designed for protein-coding genes, without considering that long non-coding RNA (lncRNA) genes may have different characteristics. In this study, we first evaluated the performance of a series of known sgRNA design tools on both coding and non-coding datasets. We then analyzed the factors underlying their varied performance on lncRNA-specific sgRNAs, including nucleic acid sequence, genomic location and editing mechanism preference. Furthermore, we introduce CRISPRlnc, a support vector machine-based machine learning algorithm that models both the CRISPR knock-out (CRISPRko) and CRISPR inhibition (CRISPRi) mechanisms to predict the on-target activity of sgRNAs. CRISPRlnc combines paired-sgRNA design with off-target analysis to achieve one-stop design of CRISPR/Cas9 sgRNAs for non-coding genes. Performance comparison on multiple datasets showed that CRISPRlnc was far superior to existing methods for lncRNA-specific sgRNA design under both the CRISPRko and CRISPRi mechanisms. To maximize the availability of CRISPRlnc, we developed a web server (http://predict.crisprlnc.cc) and made the tool available for download on GitHub.
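To make the sgRNA design step more concrete, the following is a minimal Python sketch of how candidate target sites might be enumerated and encoded before being passed to a classifier such as a support vector machine. It is not the CRISPRlnc implementation. It assumes the commonly used SpCas9 setting (a 20-nt protospacer followed by an NGG PAM), and the function names `find_protospacers`, `gc_content` and `one_hot` are hypothetical. The features actually used by CRISPRlnc are not described in this abstract.

```python
import re

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_protospacers(seq, guide_len=20):
    """List candidate sites as (strand, site_start, protospacer, pam).

    Assumption: SpCas9-style targeting, i.e. a guide_len-nt protospacer
    immediately followed by an NGG PAM. site_start is the 0-based position,
    on the forward strand, of the leftmost base of the protospacer+PAM site.
    """
    seq = seq.upper()
    site_len = guide_len + 3
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    hits = []
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for m in pattern.finditer(template):
            start = m.start()
            if strand == "-":
                # Map the reverse-strand offset back to forward coordinates.
                start = len(seq) - (m.start() + site_len)
            hits.append((strand, start, m.group(1), m.group(2)))
    return hits


def gc_content(guide):
    """Fraction of G and C bases in the guide sequence."""
    return sum(base in "GC" for base in guide) / len(guide)


def one_hot(guide):
    """Flat one-hot encoding (A, C, G, T order), four values per base."""
    vec = []
    for base in guide:
        vec.extend(1 if base == b else 0 for b in "ACGT")
    return vec
```

For each site returned by `find_protospacers`, a vector such as `one_hot(protospacer) + [gc_content(protospacer)]` could serve as classifier input. This is an illustrative feature choice only.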