A Machine Learning Approach to Identify the Importance of Novel Features for CRISPR/Cas9 Activity Prediction.

Affiliations

Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India.

Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India.

Publication information

Biomolecules. 2022 Aug 16;12(8):1123. doi: 10.3390/biom12081123.

Abstract

The growing popularity of the reprogrammable CRISPR/Cas9 genome editing tool is hindered by unwanted off-target effects. Efforts have been directed toward designing efficient guide RNAs (gRNAs) and identifying potential off-target threats, yet the factors that determine efficiency and off-target activity remain obscure. Previous machine learning models based on sequence features performed poorly on new datasets, so novel features need to be incorporated. Estimating the binding energy of the gRNA-DNA hybrid and of the Cas9-gRNA-DNA hybrid yielded better-performing machine learning models for predicting Cas9 activity. An analysis of feature contributions to the model output on a limited dataset indicated that energy features played a determining role alongside the sequence features. The binding energy features proved essential for predicting on-target activity and off-target sites. The performance plateau of current machine learning models on unseen datasets could be overcome by incorporating novel features such as binding energy. The models are provided on GitHub (GitHub Inc., San Francisco, CA, USA).
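
The idea of augmenting sequence features with an energy feature can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the DNA-alphabet representation of the guide, and the per-base energy values in `TOY_PAIR_ENERGY` are hypothetical placeholders chosen only to show how a one-hot sequence vector and a single binding-energy-like number might be combined into one feature vector.

```python
from typing import List

BASES = "ACGT"

# Hypothetical per-base-pair stabilities in arbitrary units.
# These are NOT values from the paper; they only assume that
# G/C pairs stabilise a hybrid more than A/T pairs.
TOY_PAIR_ENERGY = {"A": -1.0, "T": -1.0, "G": -2.0, "C": -2.0}


def one_hot(seq: str) -> List[float]:
    """Flatten a nucleotide sequence into a 4-per-position one-hot vector."""
    vec: List[float] = []
    for base in seq.upper():
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec


def toy_binding_energy(guide: str, target: str) -> float:
    """Sum toy energies over matching positions; mismatches contribute 0."""
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return sum(
        TOY_PAIR_ENERGY.get(g, 0.0)
        for g, t in zip(guide.upper(), target.upper())
        if g == t
    )


def build_features(guide: str, target: str) -> List[float]:
    """Concatenate sequence features with one energy-like feature."""
    return one_hot(guide) + [toy_binding_energy(guide, target)]


if __name__ == "__main__":
    # A 4-nt toy example with one mismatch at the last position.
    print(build_features("ACGT", "ACGA"))
```

A real model would replace `toy_binding_energy` with a physically grounded estimate of the gRNA-DNA and Cas9-gRNA-DNA binding energies and pass the resulting vectors to a learner of choice; the sketch only shows where such a feature would sit next to the sequence encoding.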


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c77b/9405635/685926299cf6/biomolecules-12-01123-g001.jpg
