Mak Jeffrey Kelvin, Bendandi Artemi, Salim José Augusto, Mazoni Ivan, de Moraes Fabio Rogerio, Borro Luiz, Störtz Florian, Rocchia Walter, Neshich Goran, Minary Peter
Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, United Kingdom.
CONCEPT Lab, Istituto Italiano di Tecnologia, Via Melen - 83, B Block, 16152 Genova, Italy.
NAR Genom Bioinform. 2025 May 21;7(2):lqaf054. doi: 10.1093/nargab/lqaf054. eCollection 2025 Jun.
Despite advances in determining the factors that influence the cleavage activity of a CRISPR-Cas9 single guide RNA (sgRNA) at an (off-)target DNA sequence, a comprehensive assessment of the pertinent physico-chemical/structural descriptors is missing. In particular, studies have not yet directly exploited the information-rich internal protein 3D nanoenvironment of the sgRNA-(off-)target strand DNA pair, which we obtain by harvesting 634 980 residue-level features for CRISPR-Cas9 complexes. As a proof-of-concept study, we simulated the internal protein 3D nanoenvironment for all experimentally available single-base protospacer-adjacent motif-distal mutations for a given sgRNA-target strand pair. By determining the residue-level features most relevant to CRISPR-Cas9 off-target cleavage activity, we developed STING_CRISPR, a machine learning model that accurately predicts off-target cleavage activity for the type of single-base mutations considered in this study. By interpreting STING_CRISPR, we identified four important Cas9 residue spatial hotspots and the associated structural/physico-chemical descriptor classes that influence CRISPR-Cas9 (off-)target cleavage activity for the sgRNA-target strand pairs covered in this study.
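The abstract does not say how the most relevant residue-level features were determined. As an illustration only, the general idea of ranking many candidate descriptors by their association with a measured activity can be sketched as below. Everything in this sketch is an assumption and not the STING_CRISPR method: the scoring rule (absolute Pearson correlation), the helper names `pearson` and `rank_features`, the feature names, and the synthetic data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(feature_table, activity, top_k):
    """Return the top_k feature names by absolute correlation with activity.

    feature_table maps a (hypothetical) feature name to one value per sample.
    """
    scores = {
        name: abs(pearson(values, activity))
        for name, values in feature_table.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Synthetic stand-in data: 40 samples, 20 uninformative features, and one
# feature constructed to track the (made-up) activity values closely.
rng = random.Random(0)
n_samples = 40
activity = [rng.random() for _ in range(n_samples)]
feature_table = {
    f"noise_{i}": [rng.gauss(0, 1) for _ in range(n_samples)]
    for i in range(20)
}
feature_table["informative"] = [2.0 * a + rng.gauss(0, 0.05) for a in activity]

top = rank_features(feature_table, activity, top_k=3)
print(top)
```

A univariate filter like this is only one of many possible selection strategies. It ignores interactions between features, which a model-based importance measure would capture.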