CNN-XG: A Hybrid Framework for sgRNA On-Target Prediction.

Affiliations

School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China.

Basic Experimental Center of Natural Science, University of Science and Technology Beijing, Beijing 100083, China.

Publication information

Biomolecules. 2022 Mar 7;12(3):409. doi: 10.3390/biom12030409.

Abstract

As a third-generation gene-editing technology, CRISPR/Cas9 has a wide range of applications. Successful CRISPR editing depends on a functional complex of sgRNA and Cas9 protein acting on the target gene, so sgRNAs with high specificity and high on-target cleavage efficiency make the process more accurate and efficient. Although many sophisticated machine learning and deep learning models already predict sgRNA on-target cleavage efficiency, prediction accuracy can still be improved. XGBoost performs well at classification because, as an ensemble model, it can overcome the weaknesses of a single classifier; we therefore introduce XGBoost into the model to improve the prediction of sgRNA on-target activity. We present a novel machine learning framework that combines a convolutional neural network (CNN) with XGBoost to predict sgRNA on-target knockout efficacy. The framework, called CNN-XG, has two main parts: a CNN feature extractor that automatically extracts features from sequences, and an XGBoost predictor that makes predictions from the features extracted by convolution. Experiments on commonly used datasets show that CNN-XG performed significantly better than other existing frameworks in classification mode.
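The two-stage data flow described in the abstract can be illustrated with a minimal, standard-library-only sketch of the feature-extraction stage. It is not the authors' implementation: the one-hot encoding, the hand-written filters in `KERNELS`, the global max pooling, and the example sequence are all assumptions made for illustration. In a trained CNN the filter weights are learned, and the resulting feature vector would then be passed to a gradient-boosted tree classifier such as XGBoost.

```python
from typing import List

BASES = "ACGT"  # assumed channel order for the one-hot encoding

def one_hot(seq: str) -> List[List[float]]:
    """Encode a DNA sequence as a len(seq) x 4 one-hot matrix."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_relu(x: List[List[float]], kernel: List[List[float]]) -> List[float]:
    """Slide a k x 4 kernel along the sequence (no padding), then apply ReLU."""
    k = len(kernel)
    activations = []
    for start in range(len(x) - k + 1):
        s = sum(x[start + i][c] * kernel[i][c] for i in range(k) for c in range(4))
        activations.append(max(0.0, s))
    return activations

def extract_features(seq: str, kernels: List[List[List[float]]]) -> List[float]:
    """Return one feature per kernel: the global max of its activation map."""
    x = one_hot(seq)
    return [max(conv1d_relu(x, kernel)) for kernel in kernels]

# Hypothetical hand-written filters; a real CNN learns these weights from data.
KERNELS = [
    one_hot("GG"),   # responds most strongly to a GG dinucleotide
    one_hot("TTT"),  # responds most strongly to a TTT run
]

# Hypothetical 23-nt target (20-nt protospacer followed by an NGG PAM).
features = extract_features("ACGTTTGACCGGTAGCATCGAGG", KERNELS)
print(features)  # [2.0, 3.0]
```

Each entry of `features` is the best match score of one filter anywhere in the sequence. A vector like this, computed for every sgRNA in a labelled dataset, is the kind of input the second-stage predictor would be trained on.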

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb3f/8945678/e8241708c03c/biomolecules-12-00409-g001.jpg
