Rafiq Shehla, Assad Assif
Department of Computer Science and Engineering, Islamic University of Science and Technology, Pulwama, India.
J Comput Biol. 2025 Jun 12. doi: 10.1089/cmb.2025.0031.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas9 is a leading genome editing tool, but its effectiveness is limited by considerable variation in on-target efficiency among single guide RNAs (sgRNAs). This study presents RNAS-sgRNA, a hybrid model that integrates neural architecture search (NAS) with recurrent neural networks (RNNs) to evaluate the on-target efficacy of CRISPR/Cas9 sgRNAs. The NAS component searches for and refines the RNN architecture, which takes sgRNA sequences represented as binary matrices and produces a classification score, so the architecture is discovered automatically rather than through extensive manual adjustment. Evaluated on several datasets spanning multiple cell lines, RNAS-sgRNA achieves a higher area under the receiver operating characteristic curve (AUROC) than the baseline CRISPRpred(SEQ) and DeepCRISPR models. Reported AUROC improvements are 8.62% for HCT116, 121.57% for HEK293T, 13.40% for HeLa, and 20.78% for HL60, with an overall improvement of 13.46%. Compared with DeepCRISPR, the model also achieved AUROC gains in all cell lines tested, with an average improvement of 14.74%. The study further reports that transfer learning, in which the pretrained model is fine-tuned on small cell-line datasets, delivers superior performance on smaller datasets. The authors describe this as a pioneering approach in the domain and point to potential applications in personalized medicine and genetic research.
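The abstract mentions two concrete steps that a short sketch can make tangible: representing an sgRNA sequence as a binary matrix and comparing models by AUROC. The Python below is an illustrative sketch only, not the authors' implementation. The function names, the A/C/G/T row order, and the reading of the reported percentages as relative gains over a baseline AUROC are all assumptions that the abstract does not confirm.

```python
from typing import List, Sequence

NUCLEOTIDES = "ACGT"  # assumed row order; the abstract does not specify one


def one_hot_encode(sgrna: str) -> List[List[int]]:
    """Return a 4 x L binary matrix with one row per nucleotide (hypothetical helper)."""
    seq = sgrna.upper()
    unknown = set(seq) - set(NUCLEOTIDES)
    if unknown:
        raise ValueError(f"unsupported characters: {sorted(unknown)}")
    return [[1 if base == nt else 0 for base in seq] for nt in NUCLEOTIDES]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def relative_improvement(new: float, baseline: float) -> float:
    """Percentage gain of `new` over `baseline` (assumed meaning of the reported gains)."""
    return (new - baseline) / baseline * 100.0
```

For example, `one_hot_encode("ACGT")` returns the 4 x 4 identity matrix, and `auroc([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3])` returns `1.0` because every positive example outscores every negative one. The pairwise AUROC is quadratic in the number of examples, which is fine for illustration but would be replaced by a rank-based routine on real datasets.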