Overcoming CRISPR-Cas9 off-target prediction hurdles: A novel approach with ESB rebalancing strategy and CRISPR-MCA model.

Affiliations

School of Mathematics and Computer Science, Zhejiang A&F University, Hangzhou, China.

College of Landscape Architecture, Beijing Forestry University, Beijing, China.

Publication information

PLoS Comput Biol. 2024 Sep 3;20(9):e1012340. doi: 10.1371/journal.pcbi.1012340. eCollection 2024 Sep.

Abstract

Off-target activity in the CRISPR-Cas9 system remains a formidable barrier to its broader application and development. Recent work has highlighted the potential of deep learning models for predicting off-target effects, yet these models face significant hurdles, including imbalanced datasets and the complexity of choosing encoding schemes and model architectures. To address these challenges, our study introduces an Efficiency and Specificity-Based (ESB) class rebalancing strategy designed specifically for datasets containing mismatches-only off-target instances, which is a pioneering approach in this area. Furthermore, through a careful evaluation of various one-hot encoding schemes alongside numerous hybrid neural network models, we find that encoding schemes and models of moderate complexity best balance performance and efficiency. On this foundation, we propose a novel hybrid model, CRISPR-MCA, which uses multi-feature extraction to improve predictive accuracy. The empirical results show that the ESB class rebalancing strategy surpasses five conventional methods in handling extreme dataset imbalance, with superior efficacy and broader applicability across diverse models. Notably, the CRISPR-MCA model excels at off-target effect prediction across four distinct mismatches-only datasets and significantly outperforms contemporary state-of-the-art models on datasets comprising both mismatches and indels. In summary, the CRISPR-MCA model, coupled with the ESB rebalancing strategy, offers useful insights and a robust framework for future work in this field.
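To make the abstract's terms concrete, the sketch below shows one simple way a mismatches-only guide/target pair could be one-hot encoded and how the class imbalance of a labelled dataset could be measured. It is an illustration only: the 8-channel concatenated encoding, the function names, and the example sequences are assumptions made here, and the code implements neither the encoding schemes evaluated in the paper, nor the ESB strategy, nor CRISPR-MCA.

```python
import numpy as np

BASES = "ACGT"  # channel order is an assumption made for this sketch


def one_hot(seq: str) -> np.ndarray:
    """Encode a DNA sequence as a (len, 4) one-hot matrix over A, C, G, T.

    Raises ValueError for any other symbol (for example N).
    """
    idx = [BASES.index(b) for b in seq.upper()]
    return np.eye(4, dtype=np.int8)[idx]


def encode_pair(guide: str, target: str) -> np.ndarray:
    """Concatenate guide and target one-hot matrices into a (len, 8) matrix.

    Mismatches-only pairs have no indels, so both sequences must have
    the same length and align position by position.
    """
    if len(guide) != len(target):
        raise ValueError("mismatches-only pairs must have equal length")
    return np.concatenate([one_hot(guide), one_hot(target)], axis=1)


def mismatch_positions(guide: str, target: str) -> list[int]:
    """Return the 0-based positions where guide and target differ."""
    pairs = zip(guide.upper(), target.upper())
    return [i for i, (g, t) in enumerate(pairs) if g != t]


def imbalance_ratio(labels) -> float:
    """Return negatives per positive for binary labels (1 = off-target)."""
    labels = np.asarray(labels)
    positives = int(labels.sum())
    if positives == 0:
        raise ValueError("no positive samples")
    return (labels.size - positives) / positives
```

For a 23-nt guide and an equally long candidate site, `encode_pair` returns a 23 x 8 matrix that a convolutional or recurrent model could consume, while `imbalance_ratio` gives a quick view of how skewed the labels are before any rebalancing is applied.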

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee88/11398643/72f57217bc3a/pcbi.1012340.g001.jpg
