CASowary: CRISPR-Cas13 guide RNA predictor for transcript depletion.

Affiliations

Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University Indianapolis (IUPUI), 535 West Michigan St, Indianapolis, IN, 46202, USA.

Department of Chemistry, The University of Chicago, Chicago, IL, USA.

Publication information

BMC Genomics. 2022 Mar 2;23(1):172. doi: 10.1186/s12864-022-08366-2.

Abstract

BACKGROUND

The recent discovery of the gene editing system based on CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and CRISPR-associated (Cas) proteins has led to its widespread use for improving our understanding of a variety of biological systems. Cas13, a lesser-studied Cas protein, has been repurposed to allow efficient and precise editing of RNA molecules. The Cas13 system relies on base complementarity between a crRNA/sgRNA (CRISPR RNA or single guide RNA) and a target RNA transcript to bind preferentially to the target transcript alone. Compared with the upstream regulatory regions of protein-coding genes in the genome, the transcriptome is significantly more redundant, and many transcripts share long stretches of identical nucleotide sequence. Transcripts also adopt complex three-dimensional structures and interact with an array of RNA-binding proteins (RBPs), both of which may affect how effectively a target transcript is depleted. However, our understanding of the features that determine whether a specific sgRNA will effectively knock down a transcript, and of the methods that can predict this, is very limited.
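The two points above, guide/target base complementarity and transcriptome redundancy, can be illustrated with a minimal Python sketch. It is not part of CASowary: the function names and the toy transcripts (`tx_A`, `tx_B`, `tx_C`) are hypothetical, and only perfect complementarity is considered, whereas real guide binding may tolerate mismatches.

```python
from typing import Dict, List

# Toy illustration only: all sequences and names here are made up.
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]

def matching_transcripts(guide: str, transcripts: Dict[str, str]) -> List[str]:
    """Names of transcripts containing a site perfectly complementary to the guide."""
    site = reverse_complement(guide)
    return [name for name, seq in transcripts.items() if site in seq.upper()]

# Hypothetical transcripts: tx_A and tx_B share an identical 12-nt stretch.
transcripts = {
    "tx_A": "GGAUCCGAUUACGGCAUUAGC",
    "tx_B": "CCCGAUUACGGCAUAAA",
    "tx_C": "AUGGCGCGCGCUUAA",
}

# A guide designed against a site in tx_A is the reverse complement of that site.
guide = reverse_complement("CGAUUACGGCAU")
hits = matching_transcripts(guide, transcripts)
print(guide, hits)
```

In this toy case the guide designed against `tx_A` also matches `tx_B`, which is the kind of off-target ambiguity that shared sequence stretches create.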

RESULTS

Here we present CASowary, a novel machine learning-based computational tool that predicts the efficacy of an sgRNA. We used publicly available RNA knockdown data from Cas13 characterization experiments for 555 sgRNAs targeting the transcriptome in HEK293 cells, in conjunction with transcriptome-wide protein occupancy information. Our model uses a Decision Tree architecture with a set of 112 sequence and target availability features to classify sgRNA efficacy into one of four classes, based on the expected level of target transcript knockdown. After accounting for noise in the training data set, the noise-normalized accuracy exceeds 70%. Additionally, predictions of highly effective sgRNAs have been experimentally validated using CIRTS, an independent RNA-targeting Cas system, confirming the robustness and reproducibility of our model's sgRNA predictions. By comparing a transcriptome-wide protein occupancy map generated with POP-seq in HeLa cells against a publicly available protein-RNA interaction map in HEK293 cells, we show that CASowary can predict high-quality guides for numerous transcripts in a cell-line-specific manner.
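As a rough illustration of the kind of inputs and labels described above, the following Python sketch computes a few sequence features, one target availability feature, and a four-class knockdown label. Everything in it is an assumption made for illustration: the abstract does not list the 112 features or the class boundaries, so the feature names, the 0/1 occupancy encoding, and the cutoffs (0.25, 0.5, 0.75) are hypothetical and are not CASowary's actual definitions.

```python
from typing import Dict, List, Sequence

def sequence_features(guide: str) -> Dict[str, float]:
    """Simple composition features for a guide (hypothetical feature set)."""
    guide = guide.upper()
    n = len(guide)
    feats = {f"frac_{base}": guide.count(base) / n for base in "ACGU"}
    feats["gc_content"] = (guide.count("G") + guide.count("C")) / n
    return feats

def availability(occupancy: List[int]) -> float:
    """Fraction of target-site positions not covered by protein.

    Assumed encoding: 0 = position free, 1 = position protein-occupied.
    """
    return occupancy.count(0) / len(occupancy)

def knockdown_class(knockdown: float,
                    cutoffs: Sequence[float] = (0.25, 0.5, 0.75)) -> int:
    """Map fractional knockdown in [0, 1] to a class 0..3 (assumed cutoffs)."""
    return sum(knockdown >= c for c in cutoffs)

# Made-up guide, occupancy track, and measured knockdown.
feats = sequence_features("AUGCCGUAAUCG")
feats["availability"] = availability([0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0])
label = knockdown_class(0.8)
print(feats, label)
```

Feature dictionaries and labels of this shape could then be passed to any decision tree implementation. A cell-line-specific prediction would swap in the occupancy track measured in the cell line of interest.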

CONCLUSIONS

Application of CASowary to whole transcriptomes should enable rapid deployment of CRISPR/Cas13 systems, facilitating the development of therapeutic interventions linked with aberrations in RNA regulatory processes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/8889671/a34619ae0bb7/12864_2022_8366_Fig1_HTML.jpg