Prediction of sgRNA Off-Target Activity in CRISPR/Cas9 Gene Editing Using Graph Convolution Network.

Authors

Vinodkumar Prasoon Kumar, Ozcinar Cagri, Anbarjafari Gholamreza

Affiliations

iCV Lab, Institute of Technology, University of Tartu, 51009 Tartu, Estonia.

PwC Advisory Finland, 00180 Helsinki, Finland.

Publication information

Entropy (Basel). 2021 May 14;23(5):608. doi: 10.3390/e23050608.

Abstract

CRISPR/Cas9 is a powerful genome-editing technology that has been widely applied in targeted gene repair and gene expression regulation. One of the main challenges for the CRISPR/Cas9 system is unexpected cleavage at unintended sites (off-targets), and predicting these sites is necessary given their relevance to gene editing research. Few deep learning models have so far been developed to predict the off-target propensity of single guide RNA (sgRNA) at specific DNA fragments, and those that exist rely on hand-crafted feature extraction combined with machine learning techniques, a convoluted process that is difficult for researchers to understand and implement. In this work, we introduce a novel graph-based approach to predicting the off-target activity of sgRNA in the CRISPR/Cas9 system that is easy for researchers to understand and replicate. We build a graph with sequences as nodes and use a link prediction method to predict the presence of links between sgRNAs and off-target-inducing target DNA sequences. Node features are extracted from the sequences themselves. We used the HEK293 and K562 datasets in our experiments. The graph convolutional network (GCN) predicted off-target gene knockouts by predicting links between sgRNA and off-target sequences, achieving an auROC of 0.987.
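
The graph formulation described in the abstract (sequences as nodes, features taken from the sequences, link prediction between sgRNA and candidate DNA nodes) can be illustrated with a minimal sketch. This is not the authors' implementation: the k-mer features, the single untrained graph convolution layer, the dot-product decoder, the toy sequences, and every function name below are assumptions chosen for illustration only.

```python
import math
import random
from itertools import product

K = 2  # k-mer length; an assumption, the abstract does not specify the features
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

def kmer_features(seq):
    """Normalized k-mer counts computed from the sequence itself (A/C/G/T only)."""
    counts = [0.0] * len(KMERS)
    for i in range(len(seq) - K + 1):
        counts[KMERS.index(seq[i:i + K])] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]

def normalized_adjacency(n, edges):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} of an undirected graph."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

def link_score(emb, u, v):
    """Dot-product decoder squashed to (0, 1) with a sigmoid."""
    dot = sum(p * q for p, q in zip(emb[u], emb[v]))
    return 1.0 / (1.0 + math.exp(-dot))

# Made-up toy sequences: node 0 plays the sgRNA, nodes 1-2 candidate DNA sites.
sequences = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGAACGTACGTACGT",
    "TTTTGGGGCCCCAAAATTTT",
]
known_links = [(0, 1)]  # hypothetical observed sgRNA/off-target link

features = [kmer_features(s) for s in sequences]
a_hat = normalized_adjacency(len(sequences), known_links)

random.seed(0)
# Random, untrained weights (16 k-mer features -> 4 hidden units).
weights = [[random.uniform(-1.0, 1.0) for _ in range(4)]
           for _ in range(len(KMERS))]

embeddings = gcn_layer(a_hat, features, weights)
candidate_score = link_score(embeddings, 0, 2)  # score for an unobserved link
```

Because the weights are untrained, `candidate_score` carries no biological meaning. In a real setup the weights would be fit on known sgRNA/off-target links and the resulting scores evaluated with a metric such as auROC.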

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/88da/8156774/655329b18a11/entropy-23-00608-g001.jpg
